Kaggle is a good start, but needs to be smarter


Kaggle.com is a website that hosts competitions for data scientists. Regardless of what "data scientist" means, and where the line falls between a data scientist (who invents techniques and algorithms) and a data analyst (who uses existing tools and techniques to mine knowledge from data), the site is a good start.

As I understand it, the business idea behind Kaggle is partly to gather anonymous data analysts to solve data problems for enterprises. In return, individuals get rewards and businesses get their problems solved. It is an excellent win-win idea, but not so well implemented.

Personally, I tried to engage in the competitions, but I had a hard time motivating myself. Like (possibly) most users of the site, I was not there to win a competition. It would be great if that happened, but what really motivated me was learning from others. Unfortunately, Kaggle did not put much thought into promoting collaboration. I was hoping to team up with experts, but I faced a real competition environment, with minimal openness and collaboration. This is a good model for sports competitions, and maybe for the 1% who are good enough and just want to win a competition and get some recognition, but it wastes all the talented people who can do good data analysis but don’t have the skill or the will to win a competition.

The market for data analysis is huge, and someone has to start winning it. An average result from a data analysis is often nearly as valuable as the best possible one. Remember the renowned Netflix competition: after nearly three years and thousands of competitors, the winning predictions were only 10% better than Netflix’s own. Let’s face it: the world’s top data analysts wouldn’t care about a $2,000 prize or a free trip to a conference. Young and fresh data analysts also won’t learn enough on Kaggle. Data analysis is not as immediately rewarding as hacking is for teenagers. It requires understanding math, statistics, and maybe algorithms, so a hacking-competition style doesn’t work for data analysis.

I believe Kaggle is not going to win any significant share of the available market for crowdsourced data analysis. Another start-up with a better approach, targeting average data analysts (or scientists, if you like), is going to shine sooner or later.
