
This post is spot on. At my company, the ceiling on the amazing stuff we can build basically blew off once we were finally able to access our dataset using MapReduce. In an afternoon, we can build something that was previously inconceivable because of the large data transformations it required.


I love to hear this. What are you working on and can you share an example or two of some of the new stuff you are able to do?


I'm at Etsy. We have a new experimental search tool up at:

http://www.etsy.com/explorer

Its server is written in Clojure, but it is powered by a massive tag analysis job built with Cascading. I'll be writing a blog post about this in due time once it matures.

Our "suggested shops" feature is now powered by Elastic MapReduce and MATLAB to perform matrix factorization, believe it or not. And we're working on improved search algorithms. Being able to re-index the whole site in 15 minutes makes it much easier to iterate on the algorithm and improve it quickly.

More on our setup here:

http://codeascraft.etsy.com/2010/02/24/analyzing-etsys-data-...

We're going to be writing a lot more about this, and hopefully opening up some of our tooling beyond the JRuby DSL for Cascading we already have.
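
For anyone wondering what "matrix factorization" for something like suggested shops might look like, here is a minimal sketch in Python. To be clear, this is not Etsy's code: the thread only says they use Elastic MapReduce and MATLAB, so the SGD approach, the function names (factorize, predict, squared_error), the hyperparameters, and the toy user/shop scores are all assumptions made up for illustration.

```python
import random


def factorize(ratings, n_users, n_items, k=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Approximate a sparse user-by-shop matrix as U * V^T using plain SGD.

    ratings maps (user_index, shop_index) -> observed score.
    All names and defaults here are hypothetical, chosen for the example.
    """
    rng = random.Random(seed)
    # Small random starting factors, one k-dimensional vector per user and per shop.
    U = [[rng.uniform(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.uniform(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(steps):
        for (u, i), r in ratings.items():
            pred = sum(U[u][f] * V[i][f] for f in range(k))
            err = r - pred
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                # Gradient step on squared error with L2 regularization.
                U[u][f] += lr * (err * vf - reg * uf)
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V


def predict(U, V, u, i):
    """Predicted affinity of user u for shop i (dot product of their factors)."""
    return sum(a * b for a, b in zip(U[u], V[i]))


def squared_error(ratings, U, V):
    """Total squared error over the observed entries only."""
    return sum((r - predict(U, V, u, i)) ** 2 for (u, i), r in ratings.items())


# Toy data: 3 users, 3 shops, 6 observed scores (invented for this example).
toy_ratings = {(0, 0): 5, (0, 1): 3, (1, 0): 4, (1, 2): 1, (2, 1): 1, (2, 2): 5}
U, V = factorize(toy_ratings, n_users=3, n_items=3)
# Score a pair that was never observed, e.g. user 0 and shop 2.
suggestion_score = predict(U, V, 0, 2)
```

The idea is that unobserved (user, shop) pairs get a predicted score from the learned factors, and the highest-scoring shops become the suggestions. At real scale the expensive part is building the input matrix from raw data, which is presumably where the MapReduce step comes in.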



