Math is hard.

Following in the footsteps of many wannabe geeks, I have been playing around with machine-learning algorithms for the Netflix Prize. The group that improves Netflix's recommendation algorithm (Cinematch) by 10% wins a cool million. I'm really impressed with some of these teams -- the top competitor has already beaten Cinematch by 5.77%.

My results have been pretty dismal, though not unexpectedly so: undergrad prob & stats is the limit of my math expertise, and I am learning linear algebra as I go. The leading teams, by contrast, have multiple Ph.D.s working full-time on this.

I developed a correlation algorithm for item-based filtering, and it's achieving an RMSE (root mean squared error) of around 1.04. That is only slightly better than just predicting each movie's average score, and it has a long way to go before it catches up with Cinematch (at 0.95; lower is better). I have a few tweaks in mind, but my cycle time is too high: I'm precomputing correlation tables, so every tweak costs something like 8 hours of table computation.
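
For anyone wondering what that looks like in practice, here is a rough sketch of the general idea in Python. It is not my actual code. It assumes Pearson correlation as the similarity measure and a movie-mean baseline, and it computes correlations on the fly instead of precomputing tables. The names (rmse, pearson, predict, by_movie) are made up for illustration.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def pearson(ratings_a, ratings_b):
    """Pearson correlation of two movies over the users who rated both.

    Each argument is a dict mapping user id -> rating (assumed layout).
    Returns 0.0 when there is too little overlap or no variance.
    """
    common = ratings_a.keys() & ratings_b.keys()
    n = len(common)
    if n < 2:
        return 0.0
    mean_a = sum(ratings_a[u] for u in common) / n
    mean_b = sum(ratings_b[u] for u in common) / n
    cov = sum((ratings_a[u] - mean_a) * (ratings_b[u] - mean_b) for u in common)
    var_a = sum((ratings_a[u] - mean_a) ** 2 for u in common)
    var_b = sum((ratings_b[u] - mean_b) ** 2 for u in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def predict(user, movie, by_movie):
    """Item-based prediction: the movie's mean rating, adjusted by the
    user's deviations on other movies, weighted by correlation.

    by_movie maps movie id -> {user id: rating} (assumed layout).
    """
    target = by_movie[movie]
    target_mean = sum(target.values()) / len(target)
    weighted, total_weight = 0.0, 0.0
    for other, ratings in by_movie.items():
        if other == movie or user not in ratings:
            continue
        w = pearson(target, ratings)
        other_mean = sum(ratings.values()) / len(ratings)
        weighted += w * (ratings[user] - other_mean)
        total_weight += abs(w)
    if total_weight == 0:
        return target_mean  # nothing to go on: fall back to the movie average
    # Clamp to the 1-5 star scale.
    return min(5.0, max(1.0, target_mean + weighted / total_weight))
```

Recomputing every correlation per prediction like this would be hopeless on the full Netflix data set, which is why the tables get precomputed, and why every tweak to the correlation step means rebuilding them.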

I think I give up.

Comments (1)

John:

Heh, I wondered if this was something you'd be interested in, but never asked. Should have known you'd try it. Eh, give you a month solid and you'd be at 15% improvement ;)

About

This page contains a single entry from the blog posted on November 27, 2006 2:23 PM.
