tl;dr: I let an algorithm watch me play 4 levels, and this video is an example of how well it performs on a completely different level.
There's a recurring Mario AI contest, and its organizers have written a decent framework for training AI players (or playing the game yourself) on randomly generated levels: marioai.org
I wrote a recorder that stores a description of what's on the screen in each frame, along with the buttons I pressed during that frame. Then I experimented with various Java machine learning libraries and algorithms for mapping frames to key presses. Most of them produced Marios that just stood in place or always ran to the right, and in general didn't react very intelligently to their environment.
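To make the recorder idea concrete, here is a minimal sketch of what one recorded frame could look like. It is not the actual recorder and doesn't use the contest framework: the class name, the button names, and the "thing@x,y" token format are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch: none of these names come from the Mario AI framework.
class FrameRecorder {
    // One line per frame: the pressed buttons, a tab, then what was on screen.
    private final List<String> lines = new ArrayList<>();

    // buttons: names of the keys held this frame, e.g. "RIGHT", "JUMP".
    // screen: one token per visible object, e.g. "brick@2,-1" (assumed format).
    void record(List<String> buttons, List<String> screen) {
        // Sort the buttons so the same combination always yields the same label.
        String label = buttons.isEmpty()
                ? "NONE"
                : String.join("+", new TreeSet<>(buttons));
        lines.add(label + "\t" + String.join(" ", screen));
    }

    // All frames recorded so far, in order.
    List<String> lines() {
        return lines;
    }
}
```

Calling record(...) once per frame builds up a list of lines, each pairing a key combination with a description of the screen, which is the kind of data a classifier can be trained on.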
On a whim, I ran the Stanford Natural Language Processing Group's log-linear classifier on the data, treating each visible block and enemy as a "word" in a "document", and the set of pressed keys as the "topic" (the class label) of that document. It works surprisingly well. nlp.stanford.edu/software/classifier.shtml
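For a rough idea of what a log-linear classifier does with this kind of data, here is a tiny stand-in written from scratch. It is not the Stanford classifier and doesn't use its API. It's a plain softmax model trained with stochastic gradient steps, and the class name, method names, and "thing@x,y" word format are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in sketch of a log-linear (softmax) classifier over bag-of-words
// features. This is NOT the Stanford classifier; all names are hypothetical.
class TinyLogLinear {
    private final List<String> labels = new ArrayList<>();
    // weights.get(label).get(word) = how strongly a word votes for a label.
    private final Map<String, Map<String, Double>> weights = new HashMap<>();

    // P(label | words): softmax of the summed per-word weights.
    double[] probabilities(String[] words) {
        double[] scores = new double[labels.size()];
        double max = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < scores.length; i++) {
            Map<String, Double> w = weights.get(labels.get(i));
            for (String word : words) {
                scores[i] += w.getOrDefault(word, 0.0);
            }
            max = Math.max(max, scores[i]);
        }
        double sum = 0.0;
        for (int i = 0; i < scores.length; i++) {
            scores[i] = Math.exp(scores[i] - max); // subtract max for stability
            sum += scores[i];
        }
        for (int i = 0; i < scores.length; i++) {
            scores[i] /= sum;
        }
        return scores;
    }

    // Stochastic gradient ascent on the log-likelihood.
    // docs[d] holds the words for frame d; keys[d] is its key-combination label.
    void train(String[][] docs, String[] keys, int epochs, double rate) {
        for (String key : keys) {
            if (!weights.containsKey(key)) {
                labels.add(key);
                weights.put(key, new HashMap<>());
            }
        }
        for (int e = 0; e < epochs; e++) {
            for (int d = 0; d < docs.length; d++) {
                double[] p = probabilities(docs[d]);
                for (int i = 0; i < p.length; i++) {
                    String label = labels.get(i);
                    // Gradient: (1 if this is the true label, else 0) minus p.
                    double grad = (label.equals(keys[d]) ? 1.0 : 0.0) - p[i];
                    for (String word : docs[d]) {
                        weights.get(label).merge(word, rate * grad, Double::sum);
                    }
                }
            }
        }
    }

    // The label with the highest probability for this frame.
    String predict(String[] words) {
        double[] p = probabilities(words);
        int best = 0;
        for (int i = 1; i < p.length; i++) {
            if (p[i] > p[best]) best = i;
        }
        return labels.get(best);
    }
}
```

Each frame goes in as an array of words, and each distinct key combination is one label. Note that every word adds its weight to a label's score on its own, so the model has no way to express "this block matters only when that enemy is also visible" unless such a combination is added as its own word.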
This video shows the AI after training on only 4 levels, none of which is the level shown here. It doesn't seem to get much better with more data, so now I'm experimenting with different features and different ways of combining models, since the algorithm makes some pretty strong assumptions about how different parts of the screen relate to one another (it essentially treats each block and enemy as independent evidence for a key press).