Simple Workflow Engine, Re-designed and Simplified (even more)

Redesigning SWE, a simple workflow engine:

I totally redesigned the SWE project, as I had to use the state machines in one of my own projects, and learned from its experience. The API for SWE is highly simplified. Everything gets done with only one public class “SateMachine<SO, RT>”. This class implements  four interfaces:

  • IStateMachineBuilder
  • IStateMachineController
  • IStateMchine
  • IStableStateMachine

To create and use a state machine is very simple. Each state has also got a very simple concept.

All that matters is:
State: Has a name, does some stuff, and decides what the next step is.

Below is a simple Light; on – off example:

Var smBuilder = StateMachine<Lamp, Lamp>.Create("On")
.State("On", (s, l) =>
{
    l.IsOn = true;
    return StateMachine.NamedState("Off");
})
.State("Off", (s, l) =>
{
    l.IsOn = false;
    return StateMachine.NamedState("On");
});
Var sm = smBuilder.BindTo(theLamp);

//Usage: each call, alternates the lamp between on and off.
While(true){
    sm.Process();
    Thread.Sleep(10);
}

In another post, I show a more advanced use of swe, with a csv parser.

Kaggle is good start, but needs to be smarter

Kaglle.com is a website that hosts competitions for data scientists. Regardless of what data scientist means, and where is the line between data scientist (who invents techniques and algorithms) and data analyst (who uses existing tools and techniques to mine knowledge from data), the site is a good start.

I understand the business idea behind kaggle is partly collecting anonymous data analysts to solve data problems for enterprise. In return, individuals get rewards and businesses get their problem solved. Excellent win-win idea, but not so well implemented.

Personally, I tried to engage in the competitions, but I had a hard time motivating myself. Like (possibly) most users of the site, I was not there to win a competition, it would be great if that happens, but what would turn me on was learning from others. Unfortunately, kaggle did not put specific thoughts to promote collaboration. I was hoping to team up with experts, but I faced a real competition environment. Minimum openness and collaboration. This is a good model for sports competitions, and maybe the 1% whom are good enough and just want to win a competition and get some recognition, but wastes all the talents that can do good data analysis but don’t have the skill or will to win a competition.

The market for data analysis is huge, and someone has to start winning it. An average result from the data analysis is actually as valuable as the best possible result. Remember, the renowned Netflix competition, after 2 years and thousand of competitors, the estimation was only 10% better. Let’s face it. World’s top data analysts wouldn’t care about $2,000 prize or a free trip to a conference. Young and fresh data analysts also won’t get enough learning in Kaggle. Data analysis is not as rewarding as hacking is for teenagers, they should understand math, statistics, and maybe algorithms, so hacking competition style doesn’t work for data analysis.

I believe Kaggle is not going to win any significant share in the available market for crowd sourced data analysis. Another start-up with a better approach targeting average data analysts (or scientists if you like) is going to shine sooner or later.

A spinner and progress for xaml, wpf.

I wanted a spinner control, and a nice one. Google searches did not help, although heaps of recommendations and a few free controls could be found, non of them was satisfying enough for what I wanted to do. And of-course I didn’t want some C# code-behind. So I opened XamlPadX and started building one for myself.

Result is something like this and I am happy with it:

It is nothing more than a template for ProgressBar. By the way, setting the IsInditerminate  to True makes the number in the middle to disappear, resulting a generic wait spinner.

Below is the xaml to achieve the spinner progress bar, you can put it some where in your resources and use it:

<Grid.Resources>
 <Style TargetType="ProgressBar" x:Key="SpinnerProgress">
 <Setter Property="Template">
 <Setter.Value>
 <ControlTemplate TargetType="ProgressBar">
 <Grid>
 <ProgressBar x:Name="pgBar" Value="{TemplateBinding Value}" Visibility="Collapsed" IsIndeterminate="{TemplateBinding IsIndeterminate}"/>
 <TextBox Background="Red" Text="{Binding ElementName=pgBar, Path=Value}">
 <TextBox.Style>
 <Style TargetType="{x:Type TextBox}">
 <Setter Property="Template">
 <Setter.Value>
 <ControlTemplate TargetType="TextBox">
 <Grid>
 <StackPanel HorizontalAlignment="Center" VerticalAlignment="Center" Orientation="Horizontal">
 <TextBlock Text="{TemplateBinding Text}">
 <TextBlock.Style>
 <Style TargetType="TextBlock">
 <Style.Triggers>
 <DataTrigger Binding="{Binding ElementName=pgBar, Path=IsIndeterminate}" Value="True">
 <Setter Property="Visibility" Value="Collapsed"/>
 </DataTrigger>
 </Style.Triggers>
 </Style>
 </TextBlock.Style>
 <TextBlock Text="%"/>
 </TextBlock>
 </StackPanel>
 <Grid Width="50" Height="50" x:Name="mainGrid">
 <Grid.RenderTransform>
 <RotateTransform Angle="0" CenterX="25" CenterY="25" />
 </Grid.RenderTransform>
 <Grid.Triggers>
 <EventTrigger RoutedEvent="Grid.Loaded">
 <BeginStoryboard>
 <Storyboard RepeatBehavior="Forever">
 <DoubleAnimation From="0" To="360" Duration="00:00:00.750" Storyboard.TargetName="mainGrid" Storyboard.TargetProperty="(Grid.RenderTransform).(RotateTransform.Angle)" />
 </Storyboard>
 </BeginStoryboard>
 </EventTrigger>
 </Grid.Triggers>

<Border x:Name="spinnerMask" BorderThickness="6" CornerRadius="25" BorderBrush="White"/>
 <Rectangle>
 <Rectangle.OpacityMask>
 <VisualBrush Visual="{Binding ElementName=spinnerMask}"/>
 </Rectangle.OpacityMask>
 <Rectangle.Fill>
 <LinearGradientBrush EndPoint="1,0.5" StartPoint="0,0.5">
 <GradientStop Color="#FF068206" Offset="0" />
 <GradientStop Color="#FF72CE72" Offset="0.198" />
 <GradientStop Color="#FFC2FFC2" Offset="0.48" />
 <GradientStop Color="#FFC2FFC2" Offset="0.52" />
 <GradientStop Color="#FF72CE72" Offset="0.891" />
 <GradientStop Color="#FF068206" Offset="1" />
 </LinearGradientBrush>
 </Rectangle.Fill>
 </Rectangle>
 <Rectangle Fill="#FF068206">
 <Rectangle.OpacityMask>
 <VisualBrush Visual="{Binding ElementName=spinnerMask}"/>
 </Rectangle.OpacityMask>
 <Rectangle.Clip>
 <RectangleGeometry Rect="0,0,50,25"/>
 </Rectangle.Clip>
 </Rectangle>
 </Grid>
 </Grid>
 </ControlTemplate>
 </Setter.Value>
 </Setter>
 </Style>
 </TextBox.Style>
 </TextBox>
 </Grid>
 </ControlTemplate>
 </Setter.Value>
 </Setter>
 </Style>
 </Grid.Resources>

And here is how to use it:

<ProgressBar Style="{StaticResource ResourceKey=SpinnerProgress}" IsIndeterminate="False" Value="12"/>

Xaml localization: Using .resx Resources in Xaml without x:static

The Microsoft recommended approach to xaml localization is using the locbaml tool. To be honest, it sucks! It has a lot of manual work and is not compatible with previous models.

Many companies have already lots of infrastructure in place for automatic support of localization. Replacing the whole infrastructure, for just a few recently added Silverlight or WPF projects does not make sense.

People give various suggestions, first thing you hear is to use x:static and .resx files. It is alright, and if it works for you then bingo. But designer does not play well with x:static, also it only works if you use the default Visual Studio resource generator to generate classes. In general I don’t like this approach.

Good news is that Dynamic objects are lovely sometimes. Xaml can use them a lot, so I create a dynamic object called ResourceLoader, and use it to load my resources from the .resx file. Then I can use binding (withe the mode=OneTime for better performance) to load resources for the right local.

<TextBlock x:Name="LoginLable" Text="{Binding Source={StaticResource ResourceKey=Res},Path=LogonText, Mode=OneTime}">
</TextBlock>

The only additional thing is to add the static resource which is the ResourceLoader somewhere in the logical tree.

<local:ResourceLoader x:Key="Res" Assembly="LogonDialog" />

The attribute “Assembly” defines the assembly which resource should be loaded from, otherwise the ResourceLoader assembly would be selected which can be wrong depending on the solution structure and location of resources.

You can download the ResourceLoader.cs from here.

Please note that you should rename the above file ResourceLoader.cs, because wordpress does not allow uploading .cs files.

How to run most of the recursive functions iteratively!

A Helper to run recursive functions iteratively

Last night I wanted to write an iterative binary tree traversal algorithm. I first wrote it recursively, but I struggled to translate it to iterative! I was mentally exhausted. I had only 3 hours of sleep the previous night and I was up for more than 12 hours, so I decided to draw a tree to get some visual help. When I gazed to my tree, I realized nodes are doing marshal art! I eventually couldn’t come up with the right solution after half an hour of trying to convince my brain to get up. This morning when I woke up I thought about the problem again and I solved it in 10 seconds. I am not here to talk to you about how to make the tree traversal iterative, I want to tell you how to make many recursive functions iterative.

I decided to write a helper, such that next time I wanted to convert a recursive function to iterative even if I was sleep, be able do it with closed eyes. Although this helper may not work for all cases it works for many common ones.

Why do we want to make a recursive function iterative?

There are two answers to this question. First answer is simply performance. For simple solutions the iterative functions performs much better than the recursive one due to the overhead of function calls and stack operations. For more complex solutions though, this overhead can become negligible comparing to the actual work done at each recursive step.

Second reason for making a recursive call iterative is to avoid Stack Overflow. I am talking about the exception you get when the function call stack is full, not the website that you can ask programing questions.

I can’t help you with the first problem, but I can help you avoid stack overflow.

How to make it iterative?

There is an straightforward way to translate recursive functions to iterative ones using a stack. Basically by simulating what actually happens at machine level with an object which we call the RecursionHelper.

First divide your recursion code to subrecursive chunks. Generally a recursive function has three parts: Termination condition, recursive calls, and stuff to do on the current object. We have to divide the function into steps. For example consider recursively calculating sum of the items in an array:

int Sum(int index, int[] array)
{
 if (int >= array.Length) return 0;

var sumofrest = Sum(index+1, array);
 return array[index]+sumofrest;
}

Let us divide the above recursive function to its steps:

int Sum(int index, int[] array)
{
 //This is the termination condition
 if (int >= array.Length)
 //This is the returning value when termination condition is true
 return 0;

//This is the recursive call
 var sumofrest = Sum(index+1, array);

//This is the work to do with the current item and the
 //result of recursive call
 return array[index]+sumofrest;
}

Now send all this steps to the Recursive Helper:

int Sum(int[] ar)
{
 return RecursionHelper<int>.CreateSingular(i => i >= ar.Length, i => 0)
 .RecursiveCall((i, rv) => i + 1)
 .Do((i, rv) => ar[i] + rv)
 .Execute(0);
}

First line creates a recursionHelper and passes in the termination condition. Here termination condition is if i>= ar.Length, and if the condition is true function i => 0 will execute and 0 is returned.
The next two lines are definition of the recursive function divided into steps: Recursive call, which calls the function recursively with i+1, and the result comes back as “rv” for the next step. The “Do” steps does some work and returns (or doesn’t return) some result which can be accessible by other parts of the function. In this case our function definition is finished.
The Execute methods starts the recursive call with initial value of 0.

Let us do this with the binary tree in-order traversal problem:

 void PrintTreeNodes(Node node)
 {
 if (node == null) return; //Termination condition
 PrintTreeNodes(node.Left); //Recursive call 1
 Console.WriteLn(node.Value); //Do stuff
 PrintTreeNodes(node.Right); //Recusive call 2
 }
 

With RecursionHelper:

 void PrintTreeNodes(Node root)
 {
 RecursionHelper<Node>.CreateSingular(n => n == null)
 .RecursiveCall((n, rv) => n.Left)
 .Do((n, o) => Console.Write("{0},", n))
 .RecursiveCall((n, o) => n.Right)
 .Execute(root);
 }
 

The first line sets up the termination condition (n==null). It does not return anything because it is a void function. Next three lines call recursive functions and do the job. Eventually execute runs the function with initial value of root.

You see it is pretty easy to convert any recursive function to iterative. You don’t have to change any code. You should only wrap each nonrecursive part in a Do method and change the recursive call with RecursiveCall method.

Here is the code for RecursionHelper<T>:

 public class RecursionHelper<T>
 {
 private readonly List<Tuple<Func<T, T, T>, bool>> _steps = new List<Tuple<Func<T, T, T>, bool>>();
 private readonly Stack<Tuple<T, int>> _stack = new Stack<Tuple<T, int>>();
 private Func<T, bool> _terminationCondition;
 private Func<T, T> _terminationOperation;

/// <summary>
 /// Creates a single stack recursion manager.
 /// </summary>
 /// <typeparam name="TR">Type of item to recurse for</typeparam>
 /// <param name="terminateCondition">The terminate condition.</param>
 /// <param name="terminationOperation">Operation to run in case termination was true.</param>
 /// <returns></returns>
 public static RecursionHelper<T> CreateSingular(Func<T, bool> terminateCondition, Func<T, T> terminationOperation = null)
 {
 var rv = new RecursionHelper<T>
 {
 _terminationCondition = terminateCondition,
 _terminationOperation = terminationOperation
 };
 return rv;
 }

public RecursionHelper<T> RecursiveCall(Func<T, T, T> func)
 {
 addStep(func, true);
 return this;
 }

public RecursionHelper<T> Do(Func<T, T, T> func)
 {
 addStep(func, false);
 return this;
 }

public RecursionHelper<T> Do(Action<T, T> action)
 {
 addStep((i, o) =>
 {
 action(i, o);
 return o;
 }, false);
 return this;
 }

private void addStep(Func<T, T, T> func, bool isRecursive)
 {
 _steps.Add(Tuple.Create(func, isRecursive));
 }

public T Execute(T initialItem)
 {
 var currentItem = initialItem;
 var currentResult = default(T);
 var currentStep = 0;
 while (true)
 {
 var recursiveContinue = false;
 if (currentStep == 0 && _terminationCondition(currentItem))
 {
 currentResult = _terminationOperation(currentItem);
 }
 else
 for (int index = currentStep; index < _steps.Count; index++)
 {
 var step = _steps[index];
 if (step.Item2) //Step is recursive
 {
 _stack.Push(Tuple.Create(currentItem, index + 1)); //Push the current position and value
 currentItem = step.Item1(currentItem, currentResult);
 recursiveContinue = true;
 break;
 }
 currentResult = step.Item1(currentItem, currentResult);
 recursiveContinue = false;
 }
 currentStep = 0;
 if (!recursiveContinue)
 {
 //Once a function has finished it works, pop the stack and continue from where it was before
 if (_stack.Count == 0)
 return currentResult;
 var stackPopped = _stack.Pop();
 currentItem = stackPopped.Item1;
 currentStep = stackPopped.Item2;
 }
 }
 }
 }
}

Algorithm to find two repeated numbers in an array, without sorting

I saw this question in Stack Overflow which I found interesting.

This is the link to the actual question on StackOverflow.com.

There is an array of size n (numbers are between 0 and n – 3) and only 2 numbers are repeated. Elements are placed randomly in the array.

E.g. in {2, 3, 6, 1, 5, 4, 0, 3, 5} n=9, and repeated numbers are 3 and 5.

What is the best way to find the repeated numbers? Space complexity should not be more than O(1) and time complexity should not exceed O(n). Note thatn O(1) means constant amount of memory where memory consumption is not relative to the size of n and O(n) means you can look into the array only a constant number of times.

HINT: Sorting, or using a hash table is not the answer.

One solution is to sum the numbers and also sum the squares then solve the problem by solving the following math equation: A+B=cte and A^2+B^2=anotherCte.

I would suggest a different solution:

As the numbers are integers, they have constant bit size (i.e. 32). Let us assume we are working with 4 bit integers right now. We look for A and B which are the duplicate numbers.

We need 4 buckets, each for one bit. Each bucket contains numbers which its specific bit is 1. For example bucket 1 gets 2, 3, 4, 7, …:

Bucket 0 : Sum ( x where: x & 2 power 0 == 0 )
...
Bucket i : Sum ( x where: x & 2 power i == 0 )

We know what would be the sum of each bucket if there was no duplicate. I consider this as prior knowledge.
Once above buckets are generated, a bunch of them would have values more than expected. By constructing the number from buckets we will have (A OR B for your information).

We can calculate (A XOR B) as follows:

A XOR B = Array[i] XOR Array[i-1] XOR … 0, XOR n-3 XOR n-2 … XOR 0
Now going back to buckets, we know exactly which buckets have both our numbers and which ones have only one (from the XOR bit).

For the buckets that have only one number we can extract the number num = (sum – expected sum of bucket). However, we should be good only if we can find one of the duplicate numbers so if we have at least one bit in A XOR B, we’ve got the answer.

But what if A XOR B is zero? Well this case is only possible if both duplicate numbers are the same number, which then our number is the answer of A OR B.

Example:

Given array of numbers:

0, 1, 2, 3, 4, 5, 2, 3

Let’s write them in Binary format:

 0 => 0 0 0
 1 => 0 0 1
 2 => 0 1 0
 3 => 0 1 1
 4 => 1 0 0
 5 => 1 0 1
 2 => 0 1 0
 3 => 0 1 1

Let us calculate the XOR (^) first: (number xor index)

0^0 ^ 1^1 ^ 2^2 ^ 3^3 ^ 4^4 ^ 2 ^ 3 => 0 0 1

Then we create 3 bit buckets by summing all number where bucket bit is 1:

Expected bucket values:
 B3(2) B2(5) B1(9)
Actual bucket values:
 B3(2) B2(10) B1(12)

From the buckets we can say A|B = 0 1 1 which is correct 0 1 0 | 0 1 1 = 0 1 1.
Using the result of XOR, we know that bucket B1 contains only one of the extra numbers. This is because when A|B is 1, and A^B=1, then one of the numbers should be 0 and the other should be 1.
So we can calculate one number : 12 – 9 = 3. The other number is 10-3-5 = 2.

WordPress reminded me of Iran today

I opened the wordpress home page this morning and suddenly I felt like I am using internet in Iran. No image was visible and all the images were censored.

This might be a weird concept for you but it is the most common internet page for Iranians. Iranian regime blocks inappropriate material, and you know how governments work, level of inappropriateness grows throughout the time and once a black censored box appeared in your browser, it would be just a matter of time for them to fill up your screen.

Please protest against SOPA http://sopastrike.com/strike. Even if you are not American you will be affected with american government’s decision on internet, so let our voice be heard.

To realize how dangerous this bill is, please check this good article.

Visual guide to Portable Version

 

Synchronize your paper with supervisor quick and easy

One issue I always had with writing papers was to synchronize my work with other co-writers. The process is normally like that I write the draft version of the paper. Forward it to my supervisor and other co-writers. They apply the changes and send the paper back to me, so that I can merge the changes. This process is utterly painful. I prefer Latex to Word, and unfortunately there is not an easy way to track changes in Latex. In word, you can activate track changes and pass the document around but not in Latex.
What I do now, is that I write my Latex draft. Check it out using Portable Version. Send in the whole folder to co-authors (a .zip file). They do what they want to the latex folder and files and pass it back to me. I can check in their changes to the folder one by one and work out the diffs and modifications to the folder like newly added images or the files moved around.
To make Portable Version work with Latex files, I just have to disable the TFS integration. Currently the way to do this is to edit “Tfs.Config” file from where portable version is installed and replace every command with a harmless command (i.e. Command = “Cmd /c “ and every parameter = “ echo %1”)

Best Practices to Use Portable version

Portable Version (PV) can make your life much sweeter if you use it properly. Note that PV in not a full fledge version control system. In contrary it is designed for quick and dirty synchronization between temporary copies (called slave) and the version controlled copy (called master). Example below depicts some situation where PV may be used:
I have a few solutions at work which are quiet large. Each solution contains more than 100 projects and as you would imagine there are many policies around checking in stuff to the TFS server. Also any access to the work network involve a lot of IT politics which every developer wants to avoid.
This weekend I want to go home early and finish the task I was up to, at Sunday afternoon. I have a copy of the solution on my flash drive which I delete and overwrite time to time. If I work on my copy of the solution Sunday afternoon, I have to remember all the changes next morning and re-apply them to the solution on my work machine. Of course I can’t just copy the solution back because that would screw the TFS up.
This is when PV comes to rescue. I simply check out the project folder I want to work on to its folder on my flash drive (right click on folder and select checkout), take it home and work on it Sunday afternoon. Monday when I come back to work, I check the project back in (right click on .pvml file and select check in). PV takes care of all necessary actions for making sure TFS knows what you have changed. Even if I rename or move a file in my flash drive, PV will recognize that and run the “tf –rename “ command to apply my change to TFS. (Currently support for renaming and moving folders makes check-in a bit tricky and you may need to modify the .pvml file for best results, but renaming and moving files works nice).
Bottom-line is that, I should keep my changes short and small. PV is not a commercial product and I do not guaranty any loss or confusion of information, hence, if you want to check the whole solution of 200 project out using PV and take the solution to your round the world cruise and come back and synchronize your 90 days worth of change with TFS, you better make sure you understand how PV works exactly and also take a good look at the source code (from codeplex).