Shortcut for chapter specific information

Thursday, June 30, 2011

My linear algorithm for maximum subarray problem

I can proudly say I developed this algorithm.  But I certainly needed to look at the hint in Programming Pearls Chapter 8 as well as the hint at 4-1.5.  Still, making it work myself makes me feel happy.

It is arguable that the algorithm could be too long, so I might want to read other on-line references.

Should this be part of my practice routine?  It sounds like it.  Though it is more thought-provoking to finish the exercises at the end of Programming Pearls Chapter 8.

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#define N 100      /* array size */
#define M 100000   /* number of benchmark iterations */

struct arr_struct { int b;  int e;  int s; };  /* b = begin index, e = end index, s = sum */

int main(int c, char* v[])
{
  int A[N];
  int i,k;

  struct arr_struct max_array, max_suffix_array;

  for(k=0;k<M;k++){
    /* fill A with random integers roughly in [1-N/2, 1+N/2] */
    for(i=0;i<N;i++){
      A[i]= 1 + (int ) (N * (float)rand()/(float)RAND_MAX) - N/2 ;
      //      printf("%d:%d ",i,A[i]);
    }
    //    printf("\n");
   
    /* max_array: best subarray so far; max_suffix_array: best subarray ending at i */
    max_array.s = A[0];
    max_array.b = 0;
    max_array.e = 0;
   
    max_suffix_array.s = A[0];
    max_suffix_array.b = 0;
    max_suffix_array.e = 0;
   
    for(i=1;i<N;i++){
      /* extend the current best suffix if it is positive, otherwise restart at i */
      if(max_suffix_array.s>0)
        max_suffix_array.s+=A[i];
      else{
        max_suffix_array.s=A[i];
        max_suffix_array.b=i;
      }
      max_suffix_array.e=i;
     
      /* the overall best is the best suffix over all i */
      if(max_array.s < max_suffix_array.s){
        max_array.s = max_suffix_array.s;
        max_array.b = max_suffix_array.b;
        max_array.e = max_suffix_array.e;
      }
    }

    // printf("Begin %d\n",max_array.b);
    // printf("End %d\n",max_array.e);
    // printf("Sum %d\n",max_array.s);
  }
  return 0;
}

status 20110630(b)

Finished 4.1-5.  Yep, the algorithm works.  But before I post it, let me make sure I read the best literature on this problem first.

Browse reading: 450/1144
Exercises up to 4-2.1
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, 4.2-6, 4.2-7, 4.3-{1,2,4,5,6}, The whole Section 4.4, 4.5-1, P4.4(a-c), P4.6b, 5.1-{1,2}, whole Section 6.1, 6.2-{1,2,4,5,6}, 6.3-1, whole Section 7.1, 7.2-{1-3}, whole Section 7.3 and 7.4.3, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), 15-1.1, 15.1-3, 15.3-3, 15.5, A.1-{1,2,3,4,5,7},  C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

status 20110630

Finished 4.1-4 and wrote pseudocode for 4.1-5. I think the answer for 4.1-5 is correct, but I hope to test it with an implementation.  So let's play with it for a while.

Browse reading: 450/1144
Exercises up to 4-1.5
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, 4.1-5, 4.2-6, 4.2-7, 4.3-{1,2,4,5,6}, The whole Section 4.4, 4.5-1, P4.4(a-c), P4.6b, 5.1-{1,2}, whole Section 6.1, 6.2-{1,2,4,5,6}, 6.3-1, whole Section 7.1, 7.2-{1-3}, whole Section 7.3 and 7.4.3, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), 15-1.1, 15.1-3, 15.3-3, 15.5, A.1-{1,2,3,4,5,7},  C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

Tuesday, June 28, 2011

Status 20110628

Finished 4.1-3.  I can finally move on.  The schedule is killing me but my heart is still in this.  I am happy when I am persistent.

Browse reading: 450/1144
Exercises up to 4-1.4
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, 4.1-4, 4.1-5, 4.2-6, 4.2-7, 4.3-{1,2,4,5,6}, The whole Section 4.4, 4.5-1, P4.4(a-c), P4.6b, 5.1-{1,2}, whole Section 6.1, 6.2-{1,2,4,5,6}, 6.3-1, whole Section 7.1, 7.2-{1-3}, whole Section 7.3 and 7.4.3, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), 15-1.1, 15.1-3, 15.3-3, 15.5, A.1-{1,2,3,4,5,7},  C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

Measurement in 4.1-3

Final measurement of the two algorithms, n² and n log n.  In my case the crossover point is 67.  There is no change after combining with the base case.  Why should there be?  Aside from the extra logic of the added if statement, there is no reason the crossover point would change.

All measurements are in seconds, and in each condition the algorithm was run 100k times.

          n^2     nlogn
N=10      0.080   0.116
N=50      0.728   0.836
N=62      0.732   1.020
N=65      1.100   1.116
N=66      1.100   1.116
N=67      1.148   1.144  (*crossover point*)
N=68      1.188   1.156
N=75      1.388   1.288
N=76      1.418   1.308
N=78      1.492   1.424
N=82      1.620   1.408
N=88      1.852   1.536
N=100     2.352   1.812
N=1000  170.210  19.391

I consider 4.1-3 finished.
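The "combining with the base case" experiment above can be written as a hybrid: recurse until the range falls under the measured crossover, then switch to the quadratic scan.  This is only an illustrative sketch in my own naming (max_subarray_hybrid and CUTOFF are mine, and the CUTOFF value just echoes the table above):

```c
#include <limits.h>

#define CUTOFF 67  /* the crossover point measured in the table above */

/* Quadratic scan over a[l..h]: extend each starting point's sum
 * incrementally and remember the best. */
static int max_subarray_brute(const int a[], int l, int h)
{
    int i, j, best = INT_MIN;
    for (j = l; j <= h; j++) {
        int sum = 0;
        for (i = j; i <= h; i++) {
            sum += a[i];
            if (sum > best)
                best = sum;
        }
    }
    return best;
}

/* Best sum of a subarray that crosses the midpoint m: the best suffix of
 * a[l..m] glued to the best prefix of a[m+1..h]. */
static int max_crossing(const int a[], int l, int m, int h)
{
    int i, sum = 0, left = INT_MIN, right = INT_MIN;
    for (i = m; i >= l; i--) {
        sum += a[i];
        if (sum > left) left = sum;
    }
    sum = 0;
    for (i = m + 1; i <= h; i++) {
        sum += a[i];
        if (sum > right) right = sum;
    }
    return left + right;
}

/* Divide and conquer, falling back to the quadratic scan below CUTOFF. */
int max_subarray_hybrid(const int a[], int l, int h)
{
    int m, ls, rs, cs, best;
    if (h - l + 1 <= CUTOFF)
        return max_subarray_brute(a, l, h);
    m = l + (h - l) / 2;
    ls = max_subarray_hybrid(a, l, m);
    rs = max_subarray_hybrid(a, m + 1, h);
    cs = max_crossing(a, l, m, h);
    best = ls > rs ? ls : rs;
    return cs > best ? cs : best;
}
```

As the measurements suggest, the only extra cost versus the plain recursion is the size test, which is why the crossover itself should not move.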

Sunday, June 26, 2011

Status 20110626

Finished in my head: 4.1-4.  I also finished the implementations for 4.1-3.  Good progress.  Both of them take some thought, though neither needs more than a normal amount of creativity.

Browse reading: 450/1144
Exercises up to 4-1.3(finished the brute force implementation)
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, 4.1-4,4.1-5, 4.2-6, 4.2-7, 4.3-{1,2,4,5,6}, The whole Section 4.4, 4.5-1, P4.4(a-c), P4.6b, 5.1-{1,2}, whole Section 6.1, 6.2-{1,2,4,5,6}, 6.3-1, whole Section 7.1, 7.2-{1-3}, whole Section 7.3 and 7.4.3, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), 15-1.1, 15.1-3, 15.3-3, 15.5, A.1-{1,2,3,4,5,7},  C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

Finished the recursive algorithm at 4-1.3!

I am quite excited.  It's a cute program and it's not hard to understand once you know the algorithm.  Ok, I will do the measurement tomorrow.

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#define N 10000

struct arr_struct { int b;  int e;  int s; };  /* b = begin index, e = end index, s = sum */

/* Best subarray that crosses the midpoint m: the best suffix of a[l..m]
   glued to the best prefix of a[m+1..h]. */
void find_max_crossing_subarray(int a[],int l, int m, int h, struct arr_struct* as_p)
{
  int left_sum = INT_MIN;
  int sum =0 ;
  int max_left =-1;
  int max_right =-1;
  int i,j;
  for( i = m ; i >=l ; i--){
    sum += a[i];
    if(sum > left_sum){
      left_sum = sum;
      max_left = i;
    }
  }
  int right_sum = INT_MIN;
  sum =0 ;
  for( j = m+1 ; j <=h ; j++){
    sum += a[j];
    if(sum > right_sum){
      right_sum = sum;
      max_right = j;
    }
  }
  as_p->b = max_left;
  as_p->e = max_right;
  as_p->s = left_sum + right_sum;
}
 
/* The best subarray of a[l..h] lies entirely in the left half, entirely in
   the right half, or crosses the midpoint; recurse and take the maximum. */
void find_maximum_subarray(int a[],int l, int h, struct arr_struct *as_p)
{
  if(h==l){
    as_p->b = l;
    as_p->e = h;
    as_p->s = a[l];
  }else{
    int m;
    struct arr_struct left, right, cross;
    m= l + (h-l)/2;
    find_maximum_subarray(a,l,m,&left);
    find_maximum_subarray(a,m+1,h,&right);
    find_max_crossing_subarray(a,l,m,h,&cross);

    if(left.s >= right.s && left.s >= cross.s){
      as_p->b = left.b;
      as_p->e = left.e;
      as_p->s = left.s;
    }else if(right.s >= left.s && right.s >= cross.s){
      as_p->b = right.b;
      as_p->e = right.e;
      as_p->s = right.s;
    }else{
      as_p->b = cross.b;
      as_p->e = cross.e;
      as_p->s = cross.s;
    }
  }
}


int main(int c, char* v[])
{
  int A[N];
  int i;
  struct arr_struct max_array;
  for(i=0;i<N;i++){
    A[i]= 1 + (int ) (N * (float)rand()/(float)RAND_MAX) - N/2 ;
    //    printf("%d:%d ",i,A[i]);
  }
  //  printf("\n");
  find_maximum_subarray(A,0,N-1,&max_array);

  printf("Begin %d\n",max_array.b);
  printf("End %d\n",max_array.e);
  printf("Sum %d\n",max_array.s);

  return 0;
}

Ah, still haven't finished 4-1.3

I guess this is a problem.  Today I am going to spend some of my coffee time just getting the program written and tested.

Wednesday, June 22, 2011

Deadline for 4.1-3

Ok. I want to set a time for myself to resolve 4.1-3.   It's just a matter of implementation, but I am just not in the flow of doing it.  (Plus I have been busy with things other than work and algorithmic study.)

So I will try to do it by Thursday night.  Let's hope this will work out.

Concrete Mathematics Chapter 7

Read the whole thing about generating functions.  As the authors suggest, this is perhaps the most powerful method for solving recurrent sequences.

It has an almost magical effect on simple series.  At this point, I probably don't wield it well enough because it is a very technique-centric skill.   The good part is that it is very similar to digital signal processing, so it sounds possible to manipulate them using similar methods.   For example, does the FFT work on such a series?

It will be fun if this idea works out for the recurrence I am thinking about.

Tuesday, June 21, 2011

status 20110621

Finished reading Chapter 16 on greedy algorithms.  I can understand activity scheduling and the knapsack problems.  It's also very illuminating.  But I have to admit that relating them to dynamic programming and matroid theory is currently beyond me.

My conclusion is this - for this chapter and Chapter 15 on dynamic programming, it's very important to understand how the problems are related.  Though I know how to implement the algorithms, without understanding them it's hard to go beyond the superficial.

For now, I will leave it for another reread.

Browse reading: 450/1144
Exercises up to 4-1.3(finished the brute force implementation)
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, 4.1-5, 4.2-6, 4.2-7, 4.3-{1,2,4,5,6}, The whole Section 4.4, 4.5-1, P4.4(a-c), P4.6b, 5.1-{1,2}, whole Section 6.1, 6.2-{1,2,4,5,6}, 6.3-1, whole Section 7.1, 7.2-{1-3}, whole Section 7.3 and 7.4.3, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), 15-1.1, 15.1-3, 15.3-3, 15.5, A.1-{1,2,3,4,5,7},  C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

Monday, June 20, 2011

Status 20110620

Finished the brute force implementation.  Of course it is nothing difficult.  This is a place where I want to stay a while because the 3 algorithms for finding the maximum subarray are quite illuminating.

Fuzzy reading: 414/1144
Browse reading: 309/1144
Exercises up to 4-1.3(finished the brute force implementation)
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, 4.1-5, 4.2-6, 4.2-7, 4.3-{1,2,4,5,6}, The whole Section 4.4, 4.5-1, P4.4(a-c), P4.6b, 5.1-{1,2}, whole Section 6.1, 6.2-{1,2,4,5,6}, 6.3-1, whole Section 7.1, 7.2-{1-3}, whole Section 7.3 and 7.4.3, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), 15-1.1, 15.1-3, 15.3-3, 15.5, A.1-{1,2,3,4,5,7},  C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

Maximum subarray brute force algorithm (quadratic time) 1

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#define N 10

int main(int c, char* v[])
{
  int A[N];
  int i,j;
  int b, e;
  int sum, max_sum;
  for(i=0;i<N;i++){
    A[i]= 1 + (int ) (N * (float)rand()/(float)RAND_MAX) - N/2 ;
    printf("%d:%d ",i,A[i]);
  }
  printf("\n");

  b=-1;
  e=-1;
  max_sum=INT_MIN;
  /* for each start j, extend the sum one element at a time: O(n^2) overall */
  for(j=0;j<N;j++){
    sum=0;
    for(i=j;i<N;i++){
      sum+=A[i];
      if(sum>max_sum){
        max_sum=sum;
        b=j;
        e=i;
      }
    }
  }
  printf("max_sum %d b %d e %d\n",max_sum, b, e);
  return 0;
}
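For contrast, here is a sketch of the naive cubic version that the program above improves on: recomputing every (b, e) sum from scratch costs an extra inner loop, which the incremental `sum+=A[i]` removes.  The function name is mine, not from any book:

```c
#include <limits.h>

/* Naive cubic maximum subarray: for every pair (b, e), recompute the
 * sum of a[b..e] from scratch.  Three nested loops -> O(n^3), versus
 * O(n^2) for the incremental version above. */
int max_subarray_cubic(const int a[], int n)
{
    int b, e, i, best = INT_MIN;
    for (b = 0; b < n; b++) {
        for (e = b; e < n; e++) {
            int sum = 0;
            for (i = b; i <= e; i++)
                sum += a[i];
            if (sum > best)
                best = sum;
        }
    }
    return best;
}
```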

Generating random numbers


To test several routines I wrote, including sorting and the recent maximum subarray procedure, I decided to write a simple random number generator.
This is rather trivial provided that you get the calculation right.  In a typical implementation such as GNU's, RAND_MAX is usually INT_MAX, so you also need to make sure there is no overflow.  That doesn't make the task difficult at all.  A version I wrote is below.

#include <stdio.h>
#include <stdlib.h>
#define N 100

int main(int c, char* v[])
{
  int A[N];
  int i;
  for(i=0;i<N;i++){
    A[i]= 1 + (int) (N * (float)rand()/(float)RAND_MAX);  /* in [1, N], or N+1 when rand() returns RAND_MAX */
    printf("%d ", A[i]);
  }
  printf("\n");
  return 0;
}
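One caveat with the scaling formula above: since (float)rand()/(float)RAND_MAX can reach exactly 1.0, the result can occasionally be N+1, and float scaling slightly biases the buckets.  A common overflow-safe alternative is rejection sampling; this is my own sketch (function name mine, assuming 1 <= n <= RAND_MAX), not from any particular library:

```c
#include <stdlib.h>

/* Return a uniform integer in [1, n] without float scaling.
 * Rejecting draws at or above the largest multiple of n removes the
 * modulo bias, and there is no intermediate product, so nothing
 * overflows even when RAND_MAX == INT_MAX. */
int rand_1_to_n(int n)
{
    int limit = RAND_MAX - (RAND_MAX % n);  /* largest multiple of n <= RAND_MAX */
    int r;
    do {
        r = rand();
    } while (r >= limit);
    return 1 + r % n;
}
```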

Friday, June 17, 2011

Programming Pearls Chapter 8

This is the chapter that describes the now famous maximum subarray problem.  It also presents 4 versions of the algorithm with different performance.   It is where I want to drill a bit more before I move on.

When you feel stuck

  • Hunker down.  Try to focus on several simple issues.  Attack them first and build up confidence. 
  • Do something else (but not for too long).   For me, I usually spend time working on foreign languages or just writing down answers I worked out in the past in other books. 
  • Think of a simple plan to implement for your problem.  Implement it and learn something about the plan and the solutions.
  • For the simple things you know, can you think of new variations of them?

Remove my stuckness

Ok.  What I did was to first work out something simple in Spanish.  My head is so stuck in algorithms that I needed to do something else.

It does make me feel better.  Let's do this step by step then.

Stuckness

I started to feel stuck when I tried to repeat binary insertion sort.  It is certainly something I wrote, but it is so complicated that I can't recall how I made it work.

I need to think about how to avoid this.  This stuckness is unnecessary.  I would probably prefer to just work out bubble sort, then move on.

Wednesday, June 15, 2011

Daily Practice 20110615(a)

Insertion Sort 76            (C): Good
Insertion Sort w Sentinel 16 (C): Good
Selection Sort 36            (C): Good
Merge Sort 19                (C): Bad
Linear Search 9              (C): Bad
Binary Search 8              (C): Bad
 
Implemented the first 6 algorithms for daily practice; it lasted about an hour.  After merge sort, I started to lose focus, and simple things such as linear search and binary search started to feel tough.  So I needed one-off debugging.

My weakness is still C-level syntax.  It is worth the time to brush it up.  But in terms of algorithms, I think I still fully grok all 6.

Here are the two tougher ones: bubble sort and binary insertion sort.  Bubble sort is easy to implement but I always mix it up with insertion sort, whereas binary insertion sort is just plain complicated to implement.

Let's split them into the second half.  Then I will also work out 4-1.3 en route.
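Since I keep mixing the two up, here is a side-by-side sketch (my own code, not from any book) of what actually distinguishes them: bubble sort repeatedly swaps adjacent out-of-order neighbors, while insertion sort grows a sorted prefix and shifts elements to open a hole.

```c
/* Bubble sort: repeatedly swap adjacent out-of-order neighbors;
 * after pass k, the k largest elements sit in place at the end. */
void bubble_sort(int a[], int n)
{
    int i, j;
    for (i = n - 1; i > 0; i--) {
        for (j = 0; j < i; j++) {
            if (a[j] > a[j + 1]) {
                int t = a[j]; a[j] = a[j + 1]; a[j + 1] = t;
            }
        }
    }
}

/* Insertion sort: keep a[0..i-1] sorted; shift larger elements right
 * to open a hole for a[i] instead of swapping pair by pair. */
void insertion_sort(int a[], int n)
{
    int i, j;
    for (i = 1; i < n; i++) {
        int key = a[i];
        for (j = i - 1; j >= 0 && a[j] > key; j--)
            a[j + 1] = a[j];
        a[j + 1] = key;
    }
}
```

The mnemonic that works for me: bubble moves the data, insertion moves the hole.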
 

Before 4.1-3

Since I haven't implemented anything for a while (for leisure, a month?), the blurry feeling of programming has come back.

This is a moment when one should not feel ashamed.  Everyone can forget something they have done.  So I decided to go back to the daily exercises and see if I can finish them now (or see how bad I am).

Tuesday, June 14, 2011

status 20110614(b)

Finished in my head: whole Section 7.1, 7.2-{1-3}, whole Section 7.3 and 7.4.3

Fuzzy reading: 414/1144
Browse reading: 309/1144
Exercises up to 4-1.3
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, 4.1-5, 4.2-6, 4.2-7, 4.3-{1,2,4,5,6}, The whole Section 4.4, 4.5-1, P4.4(a-c), P4.6b, 5.1-{1,2}, whole Section 6.1, 6.2-{1,2,4,5,6}, 6.3-1, whole Section 7.1, 7.2-{1-3}, whole Section 7.3 and 7.4.3, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), 15-1.1, 15.1-3, 15.3-3, 15.5, A.1-{1,2,3,4,5,7},  C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

status 20110614

Finished 4-1.2.  The next question is an implementation question.  I would like to write down the pseudocode of the divide-and-conquer version before I move on to putting it in C.

Fuzzy reading: 414/1144
Browse reading: 309/1144
Exercises up to 4-1.3
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, 4.1-5, 4.2-6, 4.2-7, 4.3-{1,2,4,5,6}, The whole Section 4.4, 4.5-1, P4.4(a-c), P4.6b, 5.1-{1,2}, whole Section 6.1, 6.2-{1,2,4,5,6}, 6.3-1, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), 15-1.1, 15.1-3, 15.3-3, 15.5, A.1-{1,2,3,4,5,7},  C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

Monday, June 13, 2011

status 20110613

Finished on paper: 4-1.1, and I don't expect 4-1.2 and 4-1.3 to be difficult.  Though I will require myself to be precise when implementing the algorithms.

Fuzzy reading: 414/1144
Browse reading: 309/1144
Exercises up to 4-1.2
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, 4.1-1, 4.1-2, 4.1-5, 4.2-6, 4.2-7, 4.3-{1,2,4,5,6}, The whole Section 4.4, 4.5-1, P4.4(a-c), P4.6b, 5.1-{1,2}, whole Section 6.1, 6.2-{1,2,4,5,6}, 6.3-1, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), 15-1.1, 15.1-3, 15.3-3, 15.5, A.1-{1,2,3,4,5,7},  C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

Sunday, June 12, 2011

Another relaxing moment

Sometimes you just feel your brain is stuck.  For me, now is one of those times.  So I decided to have some fun with foreign languages, at least for tonight.   Ah, there are always problems you can't think through anyway.

Status 20110612(b)


Finished in my head: P4.6b
Fuzzy reading: 414/1144
Browse reading: 309/1144
Exercises up to 4-1.1
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, 4.1-1, 4.1-2, 4.1-5, 4.2-6, 4.2-7, 4.3-{1,2,4,5,6}, The whole Section 4.4, 4.5-1, P4.4(a-c), P4.6b, 5.1-{1,2}, whole Section 6.1, 6.2-{1,2,4,5,6}, 6.3-1, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), 15-1.1, 15.1-3, 15.3-3, 15.5, A.1-{1,2,3,4,5,7},  C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

Status 20110612

After some rest, I decided to put my head into the maximum subarray problem.  It's one of the problems that put me to shame in a programming interview.
But this time I think I have a grip on the problem.  In fact, one of the keys when you solve a problem like this is to always break it down into subproblems and try to understand the relationship between the problem and the subproblems.  I tried to break down the problem by divide and conquer and by DP.  Both gave me different answers.  Unfortunately, both were wrong as well, but it made me understand Sections 4.1 and 4.2 very quickly.  With the hint, I was also able to work out 4.1-5, which is a linear algorithm for the maximum subarray problem.  I am quite satisfied.

Finished in my head: 4.1-1, 4.1-2, 4.1-5, 4.2-6, 4.2-7

Fuzzy reading: 414/1144
Browse reading: 309/1144
Exercises up to 4-1.1
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, 4.1-1, 4.1-2, 4.1-5, 4.2-6, 4.2-7, 4.3-{1,2,4,5,6}, The whole Section 4.4, 4.5-1, P4.4(a-c), 5.1-{1,2}, whole Section 6.1, 6.2-{1,2,4,5,6}, 6.3-1, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), 15-1.1, 15.1-3, 15.3-3, 15.5, A.1-{1,2,3,4,5,7},  C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

Saturday, June 11, 2011

Looking forward

I sat down and don't feel like starting the exercises of Chapter 4.  At the same time, Chapter 16 on greedy algorithms doesn't seem interesting.  The key here, I guess, is that there are one or two algorithms which I don't grok, so I couldn't proceed.

Then I look forward: there are many interesting topics I really want to know because they are always mentioned in the literature.    There are also many topics which I genuinely just want to know more about.  They are perhaps the topics that make me a thinking person.   For example, how to work with prime numbers, or how to work with strings.   I always wonder how it can be done.    Some of these chapters will each require me to read a whole book.   This is exciting.

Though in any case, today is just not the time to proceed.  Let's slow down.

To start a new chapter

Every time I finish a chapter, the toughest part is to start a new one.  This is literally the case because the cognitive load required by any topic in algorithmic study is high, especially when you try to go deep and immerse yourself in it.   As far as I can see, this is the most effective way to learn.   A browsing type of technique usually doesn't help.

Of course, the high cognitive load also makes me feel very tired whenever I finish a chapter.   So far, the two chapters I went through are quite independent and each of them can stimulate a lot of thoughts.   If I had started to deep-read them, it could have been even more mind-draining.

"Silly you, how draining can that be if you just want to learn big O?"  It depends on how serious you are.  To explain this, imagine you always hear people say insertion sort is O(n^2), or that the best case of insertion sort is O(n).  Do these sentences ring an alarm bell in your mind?   How about the soft-O notation?  Does that reflect what O notation really is?   If you don't feel these are important issues, I am probably talking to a wall.   But if you feel these are actually difficult issues, then you are in the right spot.   Growth analysis is actually an issue which is very close to real analysis.   Sometimes you need more than normal creativity to understand a certain issue in it.

As for me, after one chapter, my question is always "How can I start the next chapter, and when?"  I remember when I self-studied a book on number theory, I was quite stumped by Legendre polynomials.  These are usual feelings.  There are always topics in math/CS books which are deeper than you thought.

My next chapter will be Chapter 4, recurrences.  Though I have worked through half of the problems already, the question remains: how come sometimes I can't see a recursive solution to a problem?  How can I gain more of this so-called "recursive thinking"?  Those questions have been stuck in my mind for a while.   My guess is that, like verbs in a Romance language, you need to use it to really grok it.

And at a moment when I feel stuck, what I will do is reboot my life slowly by doing small things.  In fact, what I said in the first sentence is not only true literally, it's probably true metaphorically.

Friday, June 10, 2011

status 20110610

Reread the chapter on heap sort.  Now I can read it through until Section 6-4.  I have yet to visualize intuitively how the heap works.  At the very least, I shouldn't just see the heap moving; I should also be able to see how the numbers move around in the array.

And once again, shame on me for not being able to finish the section on priority queues (yet).  But well, one more round and I will be able to penetrate the whole thing.

Finished in my head: 6.1-4, 6.1-5, 6.1-6, 6.2-1, 6.2-6, 6.3-1.

Fuzzy reading: 414/1144
Browse reading: 309/1144
Exercises up to 4-1.1
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, P3-6, 4.3-{1,2,4,5,6}, The whole Section 4.4, 4.5-1, P4.4(a-c), 5.1-{1,2}, whole Section 6.1, 6.2-{1,2,4,5,6}, 6.3-1, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), 15-1.1, 15.1-3, 15.3-3, 15.5, A.1-{1,2,3,4,5,7},  C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

Thursday, June 9, 2011

Relaxation before going on

Spent a bit of time reviewing a French textbook.   It's funny that the Japanese class is no longer available in my area.   So based on my rule of learning 2 languages at a time, I decided to pick French.  (The other would be my main thread.)

Language always makes you ask deep questions about human beings.  For example, does learning two languages have a positive or negative effect on learning?   Would it make one better or worse at one of the languages?

In the place where I came from, as well as many places in the world, students are usually exposed to bilingual teaching.   In some strange places such as Singapore and Hong Kong, students are even exposed to three or more languages (e.g. Singapore: Malay, Mandarin, Tamil and English).  When you visit these places, there is no lack of people who can master all the languages they were exposed to.   This sounds like a plus for bilingual learning.    That's also the basis of why I tend to always learn two things together.

In languages which have the same origins, for example Italian and Spanish, learning the structure of one language from the other is also quite helpful.  For example, professoressa and professora are based on the same root.  So once you learn one of the languages, after some practice you can pick up the other.

In fact, this is perhaps the basis of why textbooks usually teach students the concepts of cognates and false cognates when they teach European languages.   A Romance language X is likely to have tons of cognates with English.  So if you are a good speaker of English, you will automatically acquire a bunch of words very quickly.

If we believe that English can be a language which gives cognates, then why not another Romance language?   That's another plus for bilingual learning.

Now, are there any downsides?  From my experience, there seem to be some indeed.  To summarize:

If the two languages are close, then learner confusion increases.   Whereas if the two languages are distant, the learner's cognitive load will increase.

Let me first talk about the case when the two languages are close, e.g. the Spanish/Italian pair.  When I tried to learn both together, one obvious effect was that I sometimes mixed up which answer belonged to which language, mainly because they are so close.   To counter this, I had to constantly construct a 3-column table to make sure I could differentiate the subtle differences between the two languages.

An interesting effect is that my Spanish actually turned out to become better during this process.  For example, I have a stronger grasp of number words because I need to make sure both sets of rules are correct in my head.  The downside is that I have less time to learn Italian, so there is indeed a trade-off.    This will happen with any close language or dialect pair.   Examples I can think of include French vs Spanish, Spanish vs Portuguese, Spanish vs Italian and Cantonese vs Mandarin.   These pairs are all very close but slightly different.

What if the languages are distant?   Other than needing to spend extra time, in my opinion the real issue is that the languages are usually not just linguistically different but also culturally different.

For example, what if you want to learn both English and Japanese together?  Think about it!  Easy-going English and the whole frigging politeness system of Japanese!  This is what makes learning difficult.

Though at the end of the day, I am still a mild advocate of learning two languages together.   My reason, ultimately, is this: learning a language tends to open a person's mind.   Some people say this is a Mind Hack.  From my own experience, there seems to be some basis for it.

Concrete Mathematics Chapter 1: Mathematical Induction

Mostly simple and easy to read.   Learned something new.  The part about the Josephus problem is a bit more involved.

Concrete Mathematics Chapter 5: Binomial Coefficients

Hypergeometric functions and mechanical summation are beyond me.  But well, that's the consequence of looking down on "just" binomial coefficients.  It can be a topic as deep as a whole field.

I don't know why. It just makes me want to learn more.

After Chapter 3

Chapter 3 and Chapter 4 are closely related.  So it is a no brainer to start Chapter 4 right after Chapter 3.  In fact, to keep up the momentum, I might just finish one exercise tonight to make sure I move on.  There are indeed many funny things to learn.

One afterthought on Chapter 3.  It impresses me in two ways: one is the amount of math needed in algorithms; the other is what kind of math is really useful.   The most important type of math required in computer science is discrete mathematics.   It is certain that deep knowledge of probability, statistics and combinatorics helps.  But it is not as crucial as understanding the underlying structure of a program (and, deeper, the argument which leads to a new program).  My deepening routine (Read 3) on this part will be to read CMath 2, 3 and 9 and TAOCP 1.2 deeply.  I feel like this can be done any time I like, even when I am not seriously reading those two books.   It is just a relaxing thing to do when you want a problem to solve at night.

I expect to cruise through Chapter 4 as quickly as Chapter 3, but I expect a lot of resistance in Chapter 5.   Probabilistic reasoning is never easy to grasp.

Updated complementary exercises for CLRS 20110609

Chapter 2 General:
-Watch the MIT video if it is only about Ch2. (done)
-Write up the summary of the video. 
-Read all the solution sets after I finished the videos. 
-work out cost analysis similar to P.26 on selection sort, linear search and merge sort (tedious).

-Think how to draw the recursion tree for the insertion sort algorithm. 
-Work out all proofs for the following algorithms
  -insertion sort
  -selection sort
  -bubble sort
  -binary insertion sort
  -merge sort
  -linear search
  -binary search
-Implement 2.3.7, P2.1, Horner's rule in P2.3 and inversion calculation in P2.4.  2.3.7 has two solutions, one is trivial, one is based on merge.  Think them through.
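As a note to myself, the Horner's rule item (P2.3) boils down to a few lines.  This is my own sketch of the idea, not the book's pseudocode:

```c
/* Horner's rule: evaluate p(x) = c[0] + c[1]*x + ... + c[n-1]*x^(n-1)
 * with only n-1 multiplications by nesting the polynomial as
 * p(x) = c[0] + x*(c[1] + x*(... + x*c[n-1])). */
double horner(const double c[], int n, double x)
{
    double y = 0.0;
    int i;
    for (i = n - 1; i >= 0; i--)
        y = c[i] + x * y;   /* fold in the next coefficient */
    return y;
}
```

For example, with coefficients {1, 2, 3} (that is, 1 + 2x + 3x²) and x = 2, it computes 3, then 2 + 2·3 = 8, then 1 + 2·8 = 17.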


Chapter 3 General
-TAOCP? Knuth must have talked about Big Oh at a certain point.  (done)
-There is no difference between CLRS and TAOCP (done)
-Read the Wikipedia entries on big Oh notation, iterated logarithm and all related math subjects. (done)
-Finish all exercises of Sedgewick Chapter 2.
-Finish all exercises below Level 25 of TAOCP Section 1.1 and 1.2 
-Read CMath Chapter 2, 3 and 9.  Finished all easy exercises.
-Alternative definitions of the asymptotic notion.
-What is the difference between the definitions of Sedgewick with CLRS?

Insertion sort:
-C, perl, python (done)
-sentinel version of straight insertion sort (done)
-binary search (done, now a separate item)
-binary insertion sort (done, now a separate item)
-recursive version of insertion sort (done)
-start to write a performance driver
-shell sort: how does the decrement business work?

   -Pratt's increments.
   -Sedgewick's increments.
-library sort
-tree sort(from wikipedia, what is it?)

-insertion sort improvement, all suggested by Exercises from TAOCP. 
-insertion sort with linked list
-How does the idea of insertion lead to merge sort?
-C++,java, lisp, ASM, TAOCP Pseudo, TAOCP MIX, TAOCP MMIX.
-performance, worst case, average case
-different ways to avoid -INT_MAX as the sentinel. (state of art?)
-non-adaptive version of insertion sort.
-TAOCP Exercises in 5.2.1
-Sedgewick 6.3 Exercises

Selection sort:
-C, perl, python (done)
-C++,java, lisp, ASM
-stable version of selection sort.
-bidirectional version of selection sort.
-comparison with insertion sort.
-comparison with bubble sort
-recursive version of selection sort
-cycle sort
-heap sort variants
     -tournament sort
     -smoothsort.
-TAOCP Exercise in 5.2.3
-Sedgewick 6.2
-Sedgewick 6.2 Exercise

Binary Insertion sort:
-C (done)
-perl, python, C++, java, lisp, ASM

Bubble sort:
-C (done)
-shaker sort
-perl, python, java, lisp, ASM
-TAOCP 5.2.2

Merge sort:
-C, perl(done)
-python, C++,java, lisp, ASM
-timsort
-performance, worst case, average case
-Improvements from Sedgewick.
-Non-recursive version or Bottom up version
-in place mergesort.
-no extra memory implementation (state of art?)
-variations of merge sort. 
-implement merge sort with k list and a k-list merging. 2.4.1.
-TAOCP Exercise in 5.4
-Sedgewick 8 (read - done)
-Sedgewick 8 Exercises.
-Shene's 5-9 Exercises. 
-where can I find problems which apply merging? 
  -Implement CLRS 2.3-7 using merging
  -like to calculate number of inversions.

Linear search:
-C, MIXAL (done)
-perl, python, C++,java, lisp, ASM
-TAOCP 6.1 and exercises.  Remember, it's not as trivial as you may think. 

Binary search: 
-C (done)
-perl, python, C++,java, lisp, ASM
-At least 100 problems which applies binary search.
   -TopCoder's exercises on binary search. 
   -Implement CLRS 2.3-7.
   -Shene Chapter 4. 
-After reading Gries, make sure you prove your own binary search is correct. 
-TAOCP 6.2 and exercises. 
-Chapter 4 and 5 of Programming Pearl exercises. 
-Sedgewick 12.4
-Fibonacci search
-Range search
-Variations of binary search. 
-Thinking: variations of binary search.  They produce different lo, up bounds. Interesting. 
-Thinking: can we combine the idea of distribution in binary search?
-Thinking: what happens if binary search is applied to a sequence with N inversions?

Inversions:
-Try to understand TAOCP 5.1 as much as I can. 

-Try to do exercises in TAOCP 5.1.

Finally finished all exercises in Chapter 3

Though there are a couple of exercises I didn't fully get through.  This is a very math-heavy chapter, so I sort of expected a couple of exercises would get me.

Math for computer science has an interesting property.  You can learn it to an arbitrary depth.  You can also learn just a bit of it to get by.  The engineering aspect of algorithms will make you think you don't need any math to get by.

What is a similar situation?  A good one is whether you need to learn all the difficult English words in order to read.  Do you need to learn the word silhouette to read?  Do you need to learn the Latin origin of every word before you can read?

Certainly not.  But this knowledge helps subtly.  It helps when you try to look deeper. 

In any case, learning more about big Oh perhaps only gives me marginal gains.  CMath and TAOCP 1.2.1 are perhaps the books I can read to deepen my knowledge on this part (they are fun books to read anyway).

Status 20110609(c)

Finished P3-6.  Ah, finally finished all the exercises of Chapter 3!

Fuzzy reading: 414/1144
Browse reading: 309/1144
Exercises up to P4-1.1
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, 4.3-{1,2,4,5,6}, The whole Section 4.4, 4.5-1, P4.4(a-c), 5.1-{1,2}, 6.1-{1,2,3,7}, 6.2-3, 6.2-4, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), 15-1.1, 15.1-3, 15.3-3, 15.5, A.1-{1,2,3,4,5,7},  C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

Status 20110609(b)

Finished on paper the whole of P3-5.  (a) took me some extra thinking.  I gave a fishy answer but don't want to dwell on it for too long.

Fuzzy reading: 414/1144
Browse reading: 309/1144
Exercises up to P3.5 (a)
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, P3-6, 4.3-{1,2,4,5,6}, The whole Section 4.4, 4.5-1, P4.4(a-c), 5.1-{1,2}, 6.1-{1,2,3,7}, 6.2-3, 6.2-4, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), 15-1.1, 15.1-3, 15.3-3, 15.5, A.1-{1,2,3,4,5,7},  C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

status 20110609

Finished on paper the whole P3-4.  Some of the exercises can be easily solved once my head is relaxed.

Fuzzy reading: 414/1144
Browse reading: 309/1144
Exercises up to P3.5 (a)
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, P3-{5,6}, 4.3-{1,2,4,5,6}, The whole Section 4.4, 4.5-1, P4.4(a-c), 5.1-{1,2}, 6.1-{1,2,3,7}, 6.2-3, 6.2-4, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), 15-1.1, 15.1-3, 15.3-3, 15.5, A.1-{1,2,3,4,5,7},  C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

Video: All Questions Answered by Donald Knuth

As you might notice, I am pro-Knuth.  Whereas a lot of people feel that his books are not useful, I tend to find his writing illuminating; it genuinely teaches me something about algorithms.

Here is one of his talks for Google, and he has a light sense of humor which makes me like him.


Even now, I think Knuth and his point of view of computer science, which requires the use of mathematics and an understanding of the metal, are still very relevant.

I am going to move on in Chapter 3 today

I stopped for two days: Tuesday for a library trip, Wednesday for cooking.  There is also something good going on at work (unusual).  So I am sort of justified in being a bit late.

I am also perhaps a bit stumped on P3.4, a bit mesmerized by dynamic programming.  Now I know how much I don't understand it.  Ah, it always takes some time to grok.

Tuesday, June 7, 2011

status 20110607

Finished in my head 15-1.3 and 15-3.3

Fuzzy reading: 414/1144
Browse reading: 309/1144
Exercises up to P3.4 (a)
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, P3-3, P3-4{a,c,d,e,f}, P3.5d, P3.6, 4.3-{1,2,4,5,6}, Section 4.4, 4.5-1, P4.4(a-c), 5.1-{1,2}, 6.1-{1,2,3,7}, 6.2-3, 6.2-4, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), 15-1.1, 15.1-3, 15.3-3, 15.5, A.1-{1,2,3,4,5,7}, C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

CLRS Chapter 15: Dynamic Programming (2nd read)

After a more detailed read, this time I start to realize the core of this chapter.  It is actually not that important to understand 1000 algorithms based on DP.  Rather, it is important to understand why a problem is a DP problem and why it is not.  So the heart of this chapter is actually Section 15.3, and I found the examples about the unweighted shortest simple path and the unweighted longest simple path the most illuminating.

The key to knowing when you can apply DP (which I frankly feel blurry about sometimes) is to know whether there is a relationship between a problem and its subproblems.  You also want to know whether the subproblems are repeatedly solved.  Both can be observed by looking at how the big problem is solved in the first place.

It's also important to understand when a problem becomes polynomially intractable.

I think I am good for this read.  All of the examples are interesting, and currently I have a feel for the rod-cutting problem as well as the matrix multiplication problem.  The key to these types of problems is to think through the solution yourself and try to come up with it.  What I will add now is to always think through the problem's dependency tree as well as the structure of the subproblems.
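Since I keep coming back to rod cutting, here is a minimal bottom-up sketch in C so I don't forget it (the function name is my own; the price table is the sample table from CLRS, lengths 1 to 10):

```c
/* Bottom-up rod cutting: r[j] = max over 1 <= i <= j of (p[i] + r[j-i]).
   Each subproblem r[j] is solved once and then reused, which is exactly
   the two DP ingredients: optimal substructure plus overlapping
   subproblems.  Fixed to n <= 10 to keep the sketch short. */
int cut_rod(const int p[], int n)
{
    int r[11] = {0};        /* r[j] = best revenue for a rod of length j */
    for (int j = 1; j <= n; j++) {
        int best = -1;
        for (int i = 1; i <= j; i++)
            if (p[i] + r[j - i] > best)
                best = p[i] + r[j - i];
        r[j] = best;
    }
    return r[n];
}
```

With the CLRS sample price table p[] = {0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30} (index 0 unused), cut_rod(p, 4) returns 10, i.e. two pieces of length 2.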

Monday, June 6, 2011

Concrete Mathematics Chapter 3

Reread the part about floor, ceiling and mod.  They are tough little operators to master but crazily useful in practice. 

There are still some things I don't fully register in the last few parts of the chapter, mainly because I wasn't familiar with the Josephus problem.  Ah, it takes time.  It just takes time.  But I am working on it.

status 20110606(b)

Fuzzy read till P.414. Finished in my head 15-1.1 and 15-1.5.

Fuzzy reading: 414/1144
Browse reading: 309/1144
Exercises up to P3.4 (a)
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, P3-3, P3-4{a,c,d,e,f}, P3.5d, P3.6, 4.3-{1,2,4,5,6}, Section 4.4, 4.5-1, P4.4(a-c), 5.1-{1,2}, 6.1-{1,2,3,7}, 6.2-3, 6.2-4, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), 15-1.1, 15.5, A.1-{1,2,3,4,5,7}, C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

CLRS Chapter 15: Dynamic Programming

I wish I had read this chapter before I learned all the dynamic programming algorithms in my field.  There is a very good discussion of when one should consider using DP or a greedy algorithm and when one shouldn't.

I grok the rod-cutting problem, though the other problems require a bit more understanding.  I think I will go back to the matrix multiplication problem when I reread this chapter.

I consider this chapter highly important because many practical algorithms are either DP or greedy.  So it is a good thing to think through why they are used and how they should be used.

Status 20110606(a)

Finished P3.3.  I had thought through the problem, so it was not that difficult.  Also updated the fuzzy reading pointer to Chapter 15.  I read Chapter 14, but it seems to be very specific to red-black trees; probably one of the toughest subjects in the book.
Fuzzy reading: 360/1144
Browse reading: 309/1144
Exercises up to P3.4 (a)
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, P3-3, P3-4{a,c,d,e,f}, P3.5d, P3.6, 4.3-{1,2,4,5,6} , Section 4.4 4.5-1, P4.4(a-c), 5.1-{1,2}, 6.1-{1,2,3,7}, 6.2-3, 6.2-4, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), A.1-{1,2,3,4,5,7}, C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

Sunday, June 5, 2011

TAOCP 3.1, 3.2

I have a lot of work-related tasks which require random number generators.  These sections come in handy.

Why people think "reading" CLRS completely is impossible?

If you searched for "CLRS" on Google in June 2011, you would find a link which asks "How many programmers recommending “Introduction to Algorithms” by CLRS actually read the whole book?"

This is a valid question.  In a way, it is similar to "How many people recommending the "Feynman Lectures on Physics" actually read the whole book?" or "How many people actually read Dickens? Austen? Tolstoy? Dostoevsky?" or even "Tolkien?".  In my way of thinking, people who doubt that reading CLRS is possible don't really have issues with programming and algorithmic study.  What they don't grasp is how reading and understanding work.


Most of us can read.  The difference is mostly an issue of speed and understanding.  Speed is easy to understand: sparing technicalities, it is the number of characters divided by the time you spent reading the whole book.

But what is understanding, then?  This is the core of the issue.  It is a tough problem, but for many types of work there are simple answers.  Let's try to work out this kind of simple answer first.  For example, for a work of fiction, one can simply ask what the plot of the story is.  At a deeper level, one can build a detailed chronology of all the characters: identifying their names and what they have done.  We would also want to identify all their interactions.


What is an even deeper level?  That could be analysis of the text of the work (many have done that on famous books such as... well, the Bible).  It could also be an intelligent guess at what the author *really* wants to say.  That's what critics do.  It could be a comparative study of how this book compares with other similar books.  Does it have philosophical meaning?  Then philosophers would want to analyze its philosophical implications.  Is it about mind and body?  Monism vs. dualism?  Theism vs. atheism?  Is the book an expression of artistry?  Then one can think about whether the book is "beautiful", which is the domain of aesthetics.


So what do we see here?  Understanding of a book seems to be a representation of the book in different types of knowledge.  Some representations are cruder (like just knowing the plot and the title).  Some representations are certainly more complicated and difficult to attain.

Now, a lot of the contention over whether one can completely read a book lies in the fact that it is not easy to agree on what "reading" really means.  This is not just an issue with computer science books; it is an issue with any book.  So we should clarify that first for our subject, CLRS.  How do we define "understanding" of an algorithm?


In fact, from the example of fiction, one can see that a proper definition of understanding depends on what representation of understanding we use.  Many programmers, when they argue about this problem, simply ignore this simple fact.  That's fuzzy thinking.  It's a big no-no for programmers.  What are some other types of fuzzy thinking one can think of?  I can imagine a couple of others.  Here is a big one.
  • "After I read the book for the first time, I will be able to use any algorithms in the book at ease."
My bet is that this is how many programmers are thinking: after reading one single book, we become some type of super programmer and end up able to solve all issues at work.

Wrong.  For a book like CLRS, even if you finish reading the book and complete all the exercises yourself (meaning: without any help, you can answer all the exercises), you still need to learn.  There are parts of CLRS that need to be complemented by other books.  Its sorting section is known to be less practical than Sedgewick's AIC, and its theoretical depth is known to be shallower than TAOCP's.

I would say if you studied CLRS by heart, you would know around 100-200 standard algorithms.  But you don't exactly know all of their variations.  It's also impossible for *you* to change them, because you don't necessarily understand why they have survived to this day (because their underlying principles are sound).  By that definition, you need at least another read.

Back to the problem of understanding CLRS.  So what can we say about what understanding of an algorithm is?  One thing we have established so far is that understanding is not just a simple "BEEEB" in our brain, and then we understand.  Understanding is a complicated object.  We also see from the fiction example that understanding can mean different things to different people.  So I will tell you my way of understanding the book.  In the past, I outlined a post titled Current Reading Methodologies for CLRS.  I wrote that post with this thinking in mind.
  • Level 1: knowing the algorithms and the basic terminology to describe them.  E.g., you need to know the pseudocode of insertion sort and that, in big Oh, it is an O(n^2) algorithm.
  • Level 2: you can finish the exercises related to the book.  Why?  If there were no exercises, all of us could simply jump to Level 3, that is, to establish connections between all the related terms you learn.  But exercises are an accelerated method, especially in a well-written book such as CLRS.  Once you finish one exercise, you should have a feeling that you either know how to apply some material in the text, or you are able to connect some seemingly unrelated concepts, or you have learned a new property of a certain object.
  • Level 3: establish connections to everything else we humans know.  For example, in the case of big Oh: can you differentiate it from soft Oh (the one which ignores logarithmic factors)?  When you learn insertion sort, do you know its different variations?  This starts to be a deep level because there are many possible connections.  My "complementary exercises" are essentially about that.  At this point, I am still trying to understand the different variations of insertion sort.  Sounds silly?  Of course, if you ask me about slightly more advanced things such as DP and graph algorithms, I also have working knowledge; it is just not as deep.  This is also the point where you want to read other books as well.  So even though I am reading CLRS, I also read Sedgewick, CMath, TAOCP and Programming Pearls.
  • Level 4:  This starts to be the domain of research.  What are the issues in this area?  This includes a couple of things.
    • What are the interesting new problems in this domain?
    • What are the viewpoints in this domain?
    • What are the unsolved problems in this domain?
    • Is anything in this topic "useful"?  E.g., so efficient that one should let everyone know?
    • This is the moment you should feel confused. 
  • Level 5:  Try to think of some solutions for the questions you have at Level 4.  You might or might not be able to think of any.  If you do, don't waste it.  Publish it.  Tell other people.
Now here is the first question you will have if you read this far: "Does that mean it takes a long time to get to Level X of understanding?"  Of course.  No pain, no gain, my bro.  Another, more important question you should ask is "Does that mean it only takes skimming to finish Level 1 of understanding?"  That's certainly true too.  Some programmers who take a cursory look at the first few chapters of the book only take 1-2 months.  So it is not such a long time.  Not everyone needs everything in a huge algorithm book.  But reading it is still highly recommendable, because even just a tidbit of this knowledge will help you cruise through a lot of daily technical problems in your work.

So, in conclusion, I still whole-heartedly recommend you read CLRS through.  I, for one, have done a cursory read up to Chapter 15 in the last two months (what I called a "fuzzy" read).  I believe that means I will finish the so-called "fuzzy" read around this September.  My exercise read should be finished around 1.5 to 2 years later.  But I still plan to finish it, because just finishing the first two chapters gives me the feeling that I have tremendously improved in programming.  I think you can also benefit from the process.

Updated complementary exercises for CLRS 20110605

Chapter 2 General:
-Watch the MIT video if it is only about Ch2. (done)
-Write up the summary of the video. 
-Read all the solution sets after I finished the videos. 
-work out cost analysis similar to P.26 on selection sort, linear search and merge sort (tedious).
-Think how to draw the recursion tree for the insertion sort algorithm. 
-Work out all proofs for the following algorithms
  -insertion sort
  -selection sort
  -bubble sort
  -binary insertion sort
  -merge sort
  -linear search
  -binary search
-Implement 2.3.7, P2.1, Horner's rule in P2.3 and inversion calculation in P2.4.  2.3.7 has two solutions, one is trivial, one is based on merge.  Think them through.
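For the Horner's rule item (P2.3), the point worth remembering is that the nested form needs only n multiplications.  A sketch of it (the function name is my own):

```c
/* Horner's rule: evaluate a[0] + a[1]*x + ... + a[n]*x^n with only n
   multiplications, by folding y = a[i] + x*y from the top coefficient
   down instead of computing each power of x separately. */
double horner(const double a[], int n, double x)
{
    double y = 0.0;
    for (int i = n; i >= 0; i--)
        y = a[i] + x * y;
    return y;
}
```

E.g. for 1 + 2x + 3x^2 at x = 2 this returns 17.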


Chapter 3 General
-TAOCP? Knuth must have talked about big Oh at a certain point.  (done)
-Read the Wikipedia's entries in the big Oh notation, iterative logarithm and all Math related subject.
-Alternative definitions of the asymptotic notion. 
-Finish all exercises of Sedgewick Chapter 2.
-Finish all exercises below Level 25 of TAOCP Section 1.1 and 1.2 
-CMath Chapter 9, all warm up and beginner exercises. 
-What is the difference between the definitions of Sedgewick with CLRS?

Insertion sort:
-C, perl, python (done)
-sentinel version of straight insertion sort (done)
-binary search (done, now a separate item)
-binary insertion sort (done, now a separate item)
-recursive version of insertion sort (done)
-start to write a performance driver
-shell sort, how does the decrement business work?

   -Pratt's increments.
   -Sedgewick's increments.
-library sort
-tree sort (from Wikipedia; what is it?)

-insertion sort improvement, all suggested by Exercises from TAOCP. 
-insertion sort with linked list
-How does the idea of insert leads to merge sort?
-C++,java, lisp, ASM, TAOCP Pseudo, TAOCP MIX, TAOCP MMIX.
-performance, worst case, average case
-different ways to avoid -INT_MAX as the sentinel. (state of art?)
-non-adaptive version of insertion sort.
-TAOCP Exercises in 5.2.1
-Sedgewick 6.3 Exercises
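To record what the sentinel item above is about, here is my own sketch of the sentinel variant, using INT_MIN in slot 0 (one answer to the "different ways to avoid -INT_MAX as the sentinel" question is simply to use the proper constant):

```c
#include <limits.h>

/* Straight insertion sort with a sentinel: slot a[0] is reserved and
   set to INT_MIN, so the inner while loop needs no explicit j >= 1
   bounds check.  The data to sort live in a[1..n]. */
void insertion_sort_sentinel(int a[], int n)
{
    a[0] = INT_MIN;                 /* the scan always stops here */
    for (int i = 2; i <= n; i++) {
        int key = a[i];
        int j = i - 1;
        while (a[j] > key) {        /* no bounds test needed */
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;
    }
}
```

The cost of the trick is one wasted array slot and the assumption that INT_MIN never appears as real data.
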

Selection sort:
-C, perl, python (done)
-C++,java, lisp, ASM
-stable version of selection sort.
-bidirectional version of selection sort.
-comparison with insertion sort.
-comparison with bubble sort
-recursive version of selection sort
-cycle sort
-heap sort variants
     -tournament sort
     -smoothsort.
-TAOCP Exercise in 5.2.3
-Sedgewick 6.2
-Sedgewick 6.2 Exercise
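On the "stable version of selection sort" item: the fix I know of is to rotate the block instead of swapping, so equal keys keep their original relative order.  A sketch (my own naming):

```c
/* Stable selection sort: a plain swap of a[i] and a[min] can jump an
   element over its equal twins and break stability.  Rotating the
   block a[i..min] right by one preserves the relative order instead. */
void selection_sort_stable(int a[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        int min = i;
        for (int j = i + 1; j < n; j++)
            if (a[j] < a[min])      /* strict <: first minimum wins */
                min = j;
        int key = a[min];
        for (int k = min; k > i; k--)
            a[k] = a[k - 1];        /* shift, don't swap */
        a[i] = key;
    }
}
```

The shifting makes it O(n^2) moves instead of O(n) swaps, which is exactly the trade-off worth comparing against insertion sort.
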
Binary Insertion sort:
-C (done)
-perl, python, C++, java, lisp, ASM

Bubble sort:
-C (done)
-shaker sort
-perl, python, java, lisp, ASM
-TAOCP 5.2.2

Merge sort:
-C, perl(done)
-python, C++,java, lisp, ASM
-timsort
-performance, worst case, average case
-Improvements from Sedgewick.
-Non-recursive version or Bottom up version
-in place mergesort.
-no extra memory implementation (state of art?)
-variations of merge sort. 
-implement merge sort with k list and a k-list merging. 2.4.1.
-TAOCP Exercise in 5.4
-Sedgewick 8 (read - done)
-Sedgewick 8 Exercises.
-Shene's 5-9 Exercises. 
-where can I find problems which apply merging? 
  -Implement CLRS 2.3-7 using merging
  -like to calculate number of inversions.
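On counting inversions with merging (CLRS P2-4): whenever an element from the right half is emitted before the rest of the left half, it forms one inversion with each element still waiting on the left.  My own sketch of that idea:

```c
#include <string.h>

/* Sort a[lo..hi) while counting inversions.  When a[j] from the right
   half is placed before a[i] from the left half, it is inverted with
   all mid - i elements still waiting in the left half. */
long merge_count(int a[], int tmp[], int lo, int hi)
{
    if (hi - lo < 2)
        return 0;
    int mid = lo + (hi - lo) / 2;
    long inv = merge_count(a, tmp, lo, mid) + merge_count(a, tmp, mid, hi);
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        if (a[j] < a[i]) { inv += mid - i; tmp[k++] = a[j++]; }
        else             { tmp[k++] = a[i++]; }
    }
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (size_t)(hi - lo) * sizeof a[0]);
    return inv;
}
```

E.g. {2, 3, 8, 6, 1} has 5 inversions, and the array comes out sorted as a side effect.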

Linear search:
-C, MIXAL (done)
-perl, python, C++,java, lisp, ASM
-TAOCP 6.1 and exercises.  Remember, it's not as trivial as you may think. 

Binary search: 
-C (done)
-perl, python, C++,java, lisp, ASM
-At least 100 problems which apply binary search.
   -TopCoder's exercises on binary search. 
   -Implement CLRS 2.3-7.
   -Shene Chapter 4. 
-After reading Gries, make sure you prove your own binary search is correct. 
-TAOCP 6.2 and exercises. 
-Chapter 4 and 5 of Programming Pearl exercises. 
-Sedgewick 12.4
-Fibonacci search
-Range search
-Variations of binary search. 
-Thinking: variations of binary search.  They produce different lo, up bounds. Interesting. 
-Thinking: can we combine the idea of distribution in binary search?
-Thinking: what happens if binary search is applied to a sequence with N inversions?
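Since the list above keeps coming back to proving binary search correct (Gries, Shene), here is the version I would want to prove, with the loop invariant spelled out as a comment:

```c
/* Binary search over sorted a[0..n-1]; returns the index of key, or -1.
   Loop invariant: if key is in the array at all, its index lies in
   [lo, hi].  The interval shrinks every iteration, so the loop ends. */
int binary_search(const int a[], int n, int key)
{
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   /* written this way to avoid
                                           overflow of lo + hi */
        if (a[mid] == key)
            return mid;
        else if (a[mid] < key)
            lo = mid + 1;               /* key, if present, is right of mid */
        else
            hi = mid - 1;               /* key, if present, is left of mid */
    }
    return -1;                          /* interval empty: key is absent */
}
```

The "different lo, up bounds" variations in the thinking items above come from changing the invariant, e.g. to return the first index not less than key instead of any matching index.
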

Inversions:
-Try to understand TAOCP 5.1 as much as I can. 

-Try to do exercises in TAOCP 5.1. 



Appendix A:
-Read CMath Chapter 1 and 2. Finished warmup and beginner exercises.

Status 20110605(a)

Fuzzy read till Chapter 15.  The part about red-black trees is a little too tough to absorb in one shot, so I decided to work on it later.  DP and greedy algorithms happen to be used in my work a lot, so Chapters 15 and 16 will likely give me a break.  What I want to focus on, though, is the underlying theory of this material.  CLRS is likely the best textbook out there that talks about this stuff.  (Or we have to wait till the whole of V4 and V5 of TAOCP are published......)

This reminds me: maybe I should buy a copy of TAOCP V4 soon.

Fuzzy reading: 339/1144 (Until the beginning of Chapter 15 for 2nd Edition.)
Browse reading: 309/1144
Exercises up to P3.3
Needs write up: P2.1c, 3-2.4, 3-2.8, P3.1
Exercises which are finished but not on paper: 1.1-2, 1.2-2, 1.2-3, P1-1, P3-3, P3-4{a,c,d,e,f}, P3.5d, P3.6, 4.3-{1,2,4,5,6} , Section 4.4 4.5-1, P4.4(a-c), 5.1-{1,2}, 6.1-{1,2,3,7}, 6.2-3, 6.2-4, 10.1-6, 10.1-7, 12.1-{1,2,4} 12.2-{2.3}, 12.4-4, P12.4(a), A.1-{1,2,3,4,5,7}, C.1-{2,4,5,7,8,9,14}, C.2-{1,2,4,5}, C.3-{1,2,10}, all in D.1. and D.2-2

Reread Chapter 5 and 11

These are relatively tough chapters in CLRS because they require wielding mathematics, including probability and statistics.  I think this is probably my third or fourth read.  There is still material I don't feel connected with.  It is not very difficult; it just takes some time to think through.

If you think about it, the hash table is a very different device.  At the bottom of it, it relies on average-case performance, as compared to the worst-case performance analysis we use for other algorithmic devices.
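To make that contrast concrete, here is a minimal chaining table (the fixed size NBUCKET = 8 and the names are my own, made up for the sketch).  With m buckets and n keys the expected chain length is n/m, which is where the average-case O(1) search comes from; but nothing stops an unlucky key set from piling into one bucket and degrading search to O(n).

```c
#include <stdlib.h>

/* Minimal hash table with chaining.  The expected chain length is n/m
   (average case), but in the worst case every key hashes to the same
   bucket and search degrades to a linear scan. */
#define NBUCKET 8

struct node { int key; struct node *next; };
struct node *table[NBUCKET];

unsigned bucket(int key) { return (unsigned)key % NBUCKET; }

void ht_insert(int key)
{
    struct node *n = malloc(sizeof *n);
    n->key = key;
    n->next = table[bucket(key)];       /* prepend to the chain */
    table[bucket(key)] = n;
}

int ht_search(int key)
{
    for (struct node *n = table[bucket(key)]; n != NULL; n = n->next)
        if (n->key == key)
            return 1;
    return 0;
}
```

E.g. keys 5 and 13 collide here (both are 5 mod 8) and simply share a chain.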

Saturday, June 4, 2011

Our lives as programmers.

Sometimes you feel that you are going nowhere.  Sometimes you feel you hit blocks which cannot be resolved.  It is not because you haven't tried.  No, you have: you try to be maximally capable, friendly and cooperative.  The response is indifferent, cold and condescending.  This is a disappointment, because sometimes, no matter how you try, you will still have moments of bad luck.  Or simply bad beats, as poker players call them.  Those are the moments when you have done the best you can do, but bad things just come to you.
  • Your algorithm is all correct, but the input just keeps on changing.  You become a programmer with a lot of bugs.
  • In a programming task, your decision was correct, but your boss said no to your decision and insisted on his.  You gain a bad reputation in the process.
  • You investigated a certain problem down to the bottom.  Your boss listened to someone else without carefully considering your arguments.  Not only do you need to use a useless method to rework a problem you already know how to solve, you also wasted the time of the first investigation.
  • Your colleagues need your help.  You help them.  But your boss says you are not putting your A-game into your tasks.
  • At the end of the cycle, you are suddenly asked to implement brand-new features which are all brainstormy in nature.  There is no time to test them.  If you don't do it, your job is at risk.  If you do it, you are killing your colleagues.
  • Your boss hires a college friend of his or a relative of a director.  He asks you to "manage" the guy.  If something good happens, the merit belongs to that guy.  If something bad happens, you need to be responsible.
Sounds familiar?  Some people call it Dilbertesque.  In our world, though, it is actually a reality.  This is simply what happens when there is a hierarchical structure in a programming workplace.  Believe it or not, this kind of workplace still exists.  Unfair judgments come from the top, and bad decisions propagate throughout the group or the company.  You, as a programmer, have done what you can to minimize the chance that something bad happens.  What can you do?

The first thing you should do is understand that this is not your fault.  A pair of aces can get beaten by Ax in Texas Hold'em even though, as every player knows, it is a 93-to-7 shot.  Sometimes you play well but you don't get paid well.  That's just the way life is.  Your boss and your peers, as imperfect human beings, sometimes just make mistakes.  It is up to you to decide how to deal with this imperfect future.

The second thing is to make sure you still play the game well.  Don't go drinking because you are not happy about the day.  Don't take drugs or smoke because you feel those are relieving things to do.  They are counter-productive.  Don't get emotional about these issues.  If your boss or your colleagues are obnoxious, eschew them.  Make sure everything is correct in your domain.  Control the outcomes you can control.

The third thing to do is to keep on learning.  There is always a better job waiting for you.  Our discipline favors smart and hard-working people.  There is always a better life outside the job you are working on.  Every time you get such a chance, you should ask for a salary adjustment.  Why?  Because we are hot.  If you don't have a reason to stay in a place, leaving is not a bad idea.  The best people in this country (and outside the U.S.) move around.  In that way, you can get benefit out of your jobs.

But remember, moving around between companies is not easy in our industry.  You need to be good.  So study, learn.  Here are the things I recommend you learn:
1, Languages, Communications, Writing,
2, Math and basic computer science subjects.  Top 10: Algorithms, Operating Systems, Compiler Design, Database, Discrete Mathematics, Probability and Random Process,  Linear Algebra, Complexity Theory, Calculus,  Machine Learning. 
3, Management and Interpersonal Skill. 
4, Musical Instruments,
5, Art, 
6, Martial Arts.

1 and 2 are essential.  All of them, when learned seriously, can open your mind.  Being able to impress other people means a lot.  3 is cumulative; you just need to work and reflect from time to time.  4-6 will open your eyes.

The final thing to do is to treat yourself well.  Enjoy the situation.  You have only one life, so live happily; see the adversity you are facing as a challenge, as a problem-solving opportunity.  You just need one good chance; if you are smart and hardworking, your work will pay off one day.