Simple Workflow Engine, Re-designed and Simplified (even more)

Redesigning SWE, a simple workflow engine:

I totally redesigned the SWE project, as I had to use the state machines in one of my own projects, and learned from its experience. The API for SWE is highly simplified. Everything gets done with only one public class “SateMachine<SO, RT>”. This class implements  four interfaces:

  • IStateMachineBuilder
  • IStateMachineController
  • IStateMchine
  • IStableStateMachine

To create and use a state machine is very simple. Each state has also got a very simple concept.

All that matters is:
State: Has a name, does some stuff, and decides what the next step is.

Below is a simple Light; on – off example:

Var smBuilder = StateMachine<Lamp, Lamp>.Create("On")
.State("On", (s, l) =>
{
    l.IsOn = true;
    return StateMachine.NamedState("Off");
})
.State("Off", (s, l) =>
{
    l.IsOn = false;
    return StateMachine.NamedState("On");
});
Var sm = smBuilder.BindTo(theLamp);

//Usage: each call, alternates the lamp between on and off.
While(true){
    sm.Process();
    Thread.Sleep(10);
}

In another post, I show a more advanced use of swe, with a csv parser.

Kaggle is good start, but needs to be smarter

Kaglle.com is a website that hosts competitions for data scientists. Regardless of what data scientist means, and where is the line between data scientist (who invents techniques and algorithms) and data analyst (who uses existing tools and techniques to mine knowledge from data), the site is a good start.

I understand the business idea behind kaggle is partly collecting anonymous data analysts to solve data problems for enterprise. In return, individuals get rewards and businesses get their problem solved. Excellent win-win idea, but not so well implemented.

Personally, I tried to engage in the competitions, but I had a hard time motivating myself. Like (possibly) most users of the site, I was not there to win a competition, it would be great if that happens, but what would turn me on was learning from others. Unfortunately, kaggle did not put specific thoughts to promote collaboration. I was hoping to team up with experts, but I faced a real competition environment. Minimum openness and collaboration. This is a good model for sports competitions, and maybe the 1% whom are good enough and just want to win a competition and get some recognition, but wastes all the talents that can do good data analysis but don’t have the skill or will to win a competition.

The market for data analysis is huge, and someone has to start winning it. An average result from the data analysis is actually as valuable as the best possible result. Remember, the renowned Netflix competition, after 2 years and thousand of competitors, the estimation was only 10% better. Let’s face it. World’s top data analysts wouldn’t care about $2,000 prize or a free trip to a conference. Young and fresh data analysts also won’t get enough learning in Kaggle. Data analysis is not as rewarding as hacking is for teenagers, they should understand math, statistics, and maybe algorithms, so hacking competition style doesn’t work for data analysis.

I believe Kaggle is not going to win any significant share in the available market for crowd sourced data analysis. Another start-up with a better approach targeting average data analysts (or scientists if you like) is going to shine sooner or later.