My Self-Study of CLRS: Binary insertion sort 1 (CLRS pseudocode) , 2(C) and 3(C)

Friday, May 13, 2011

Binary insertion sort 1 (CLRS pseudocode) , 2(C) and 3(C)

Binary insertion sort 1 (CLRS ):  bad

Binary insertion sort 2 (C): bad

Binary insertion sort 3 (C): bad

After knowing the binary insertion sort from Programming Pearl.  I worked out ex. 2.3-5 and give a simple description of binary insertion sort.   Easy enough.   

Though when I first implement a binary insertion sort - I did that for fun - I stumbled upon how to get it correct - a normal insertion sort would only give you whether the searched value is located in the array or not.   It doesn't automatically tell you where the search value located within the array.    This makes me realize my first version was incorrect. 

So a simple plug-in of the iterative version of binary search within an insertion sort doesn't automatically work.   

Luckily, this is not a tough problem, if you are using a version of binary search which always use exclusive bound, i.e. the one which always exclude the mid point.  Then in the case when the search fails, the searched value would be in between lo and lo+1.  So, you just need to do a shift from lo to j+1, then your job is done.  

What about the mid point and the key is the same then? If stability is not a concern, you can simply insert the item right after the mid point.  If stability is a concern, then you can look backward to see if there are same value from the mid point. 

Why is it the case?  Well, first observe that, when there are only 1 items left in the range, we can only have l=r. In the first case, if A[m]=A[l]<key or A[m]=A[l]=key, then both insertion points would be l. (i.e. we will insert after l).  Just a logical check, can A[m]> key? Impossible, because previous search step would make the range to be the left of l. 

This is all logical.  So my first implementation looks like

     1    #include <stdio.h>

     2    #define N 20

     3    int main(int c, char *v[])

     4    {

     5      int A[N]={10,7,8,3,2,1,8,12,14,15,

     6                17,8,9,11,13,7,8,12,14,5};

     7      int i,j,k,l,h,m;

     8      for(j=1;j<N;j++){

     9        k=A[j];

    10        h=j-1;

    11        l=0;

    12        while(l<=h){

    13          m = l + (h-l)/2;

    14          if(k<A[m]){

    15            h=m-1;

    16            i = l;

    17          }

    18          else if(k==A[m]){

    19            i = m;

    20            break;

    21          }

    22          else {/* k> A[m]*/

    23            l=m+1;

    24            i = l;

    25          }

    26        }

    27        m=j-1;

    28        while (m>=i)

    29          A[m+1]=A[m--];

    30        A[m+1]=k;

    31      }

    32      for(i=0;i<N;i++)

    33        printf("%d ",A[i]);

    34      printf("\n");

    35      return 0;

    36    }

This becomes my binary insertion sort 2.  I feel confident and tell myself that this search is likely to be correct (humble, humble). 

But it's obvious that the binary search part can be shorter.  So here's another possibility.  if in the last iteration, both cases for A[m]>key and A[m]=key give us the insertion point = l.  Can we just collapse these two cases?  As it turned out, one can do it.  Because in iterations other than the last, when A[m]=key, you can just keep on searching.  So here comes another implementation, 

     1    #include <stdio.h>

     2    #define N 20

     3    int main(int c, char *v[])

     4    {

     5      int A[N]={10,7,8,3,2,1,8,12,14,15,

     6                17,8,9,11,13,7,8,12,14,5};

     7      int j,k,l,h,m;

     8      for(j=1;j<N;j++){

     9        k=A[j];

    10        h=j-1;

    11        l=0;

    12        while(l<=h){

    13          m = l + (h-l)/2;

    14          if(k<=A[m])

    15            h=m-1;

    16          else /* k> A[m]*/

    17            l=m+1;

    18        }

    19        m=j-1;

    20        while (m>=l)

    21          A[m+1]=A[m--];

    22        A[m+1]=k;

    23      }

    24      for(j=0;j<N;j++)

    25        printf("%d ",A[j]);

    26      printf("\n");

    27      return 0;

    28    }

This is a shorter program. Not everything is cleaned up. Logical argument makes me believe that the program is correct.  

I stopped at this point, fully aware that this algorithm is unstable (Consider sequence with numbers to be all the same), if one really wants it to be stable, you might just search it leftward.  There is another disadvantage - the search now always run lgN+1 iteration.

So BIS3 is unlikely my final binary insertion search.  Given I am hungry though, I will let it go for today

33_P

My Self-Study of CLRS

Shortcut for chapter specific information

Friday, May 13, 2011

Binary insertion sort 1 (CLRS pseudocode) , 2(C) and 3(C)

No comments:

Post a Comment