The pains of an average programmer

The last time I attempted a programming contest was on 28th Feb when I was trying to be a purple on codeforces. After a long break I thought of attempting codechef cookoff yesterday and I did.

The first problem was a cake walk. Given co-ordinates of triangles the task is to find how many of them were right-angled.

Based on popularity I chose to solve DIVQUERY next. The task is to find how many numbers in an array (size :10**5) between 2 the given indices (Ex: all values in the array between 2nd and 10th element ) are divisible by a given number K. There can be upto 10**5 such test cases. This looked difficult and almost impossible to solve.

After going through the constraints I observed that the values in the array are less than 10**5. After some thought I understood that this could be solved using some kullu thelivithetalu (street smartness). I estimated that the number of factors any number less than 10**5 can have is less than 100. 2*3*5*7*11*13*17 > 10**5 (miscalculation-1). This means that no number less than 10**5 can have more than 64 (2**(7-1)) factors.

For each number in the array I need to store the index for all its factors. For a given number this store can give me indices of all its multiples in the array. As each number can have a maximum of 64 factors the total space this store requires would be 64* 10**5 (<10**7) which is perfectly fine.

What data-structure to use. I have 2 options

  • An array of vectors: Where the ith vector gives the indices of multiples of ith value in the input array. But initializing 10**5 vectors might be very slow and I never liked vectors and could not convince myself about their amortized constant time inserts. Also as it is not possible to have a good estimate of expected size of each vector I cannot use the reserve space option. Option ruled out (miscalculation-2)
  • Map<long,int> : Where the key is actually a pair .
    num = key%1000000
    index  = key/1000000
    

    So the touple (num,index) is the key and the value is the number of multiples of num in the array upto indexth value in the array. Expected number of inserts is 10**7 and expected time is 2* 10**8 .This is perfectly fine. Map it is

Implementation comes next. At first I need to get all the keys that of the map and store them somewhere. Once this is done the required map can be built by iterating over the keys in ascending order.

Attempt-1:
Store all the keys in the map with some dummy value and give them the correct values during the iteration. Java does not allow modifying a data structure while iterating over it. C++ might be doing the same. So store all the keys in a set and then build the map by iterating over the set.

for (long i = 0; i < n; i += 1){
   scanf("%lld",&x);
   for (long j = 0; j < cnt[x]; j += 1){
      long num = factors[x][j];
      st.insert(i + 1000000l*num);
   }
}
long presNum=0,presValue=0;
set::iterator it;
for(it = st.begin(); it != st.end() ; it++){
   long num = (*it)/1000000l,ind=(*it)%1000000l;
   if(num==presNum){
      presValue++;
   }
   else {
      used_cnt[presNum]=presValue;
      presNum=num;
      presValue=1;
   }
   mp[*it]=presValue;
   used[num][presValue-1]=ind;
}
used_cnt[presNum]=presValue;
 
for (long i_ = 0; i_ < tc; i_ += 1){
   long l,r,k; scanf("%lld %lld %lld",&l,&r,&k);
   l-=2;r--;
   long l1=-1,r1=-1;
 
   for (long i = 0; i < used_cnt[k]; i += 1){       
      long i1 = used[k][i];       
      if(l>=i1 && r>=i1) {
         l1=i1; r1=i1;
      }
      else if(r>=i1) r1=i1;
      else break;
   }
 
   long ans =0;
   if(r1>=0) ans = mp[r1+1000000l*k];
   if(l1>=0) ans = ans-mp[l1+1000000l*k];
   printf("%lld\n",ans);
}

Result: TLE

Attempt-2:
Java is very strict. C++ is very loose and may be it does not do the locking business while iterating. After some digging I discovered that this is the case. So the set is not required.


for (long i = 0; i < n; i += 1){
   scanf("%lld",x);
   for (long j = 0; j &lt; cnt[x]; j += 1){
      long num = factors[x][j];
      mp[i + 1000000l*num]=1;
   }
}

long presNum=0,presValue=0;
map<long,long>::iterator it;

for(it = mp.begin(); it != mp.end() ; it++){
   long num = (it->val1)/1000000l,ind=(it->val1)%1000000l;
   if(num==presNum){
      presValue++;
   }
   else {
      used_cnt[presNum]=presValue;
      presNum=num;
      presValue=1;
   }
   mp[it->val1]=presValue;
   used[num][presValue-1]=ind;
}

Result: TLE

Attempt-3:
May be I’m loosing very narrowly and some minor modifications might result in AC. So I better modify the key to (num<<20) + ind. Getting back num, index from key will be key>>20, key&((1<<20) -1). After lot of pain with bracketting I ended with the code is the best I can do.


for (long i = 0; i < n; i += 1){
   ni(x);
   for (long j = 0; j < cnt[x]; j += 1){
      num = factors[x][j];
      //cout<<num<<"-"<<(i + (num<<20))<<" ";
      mp[i + (num<<20)]=1;
   }

}

long presNum=0,presValue=0,ind,r1;
map<long,long>::iterator it;
long __num = ((1<<20)-1);
for(it = mp.begin(); it != mp.end() ; it++){
   num = (it->val1)>>20,ind= ((it->val1)& __num) ;
   if(num==presNum){
      presValue++;
   }
   else {
      used_cnt[presNum]=presValue;
      presNum=num;
      presValue=1;
   }
   it->val2=presValue;
   used[num][presValue-1]=ind;
}
used_cnt[presNum]=presValue;

Result: TLE

Attempt-4:
What if all the test cases are 1,10**5,1 ? My solution would blow off. Use binary search. I always prefer not to use the upper_bound and lower_bound functions. But now I have to.

l1 = *(upper_bound(used[k],used[k]+used_cnt[k],l)-1);
r1 = *(upper_bound(used[k],used[k]+used_cnt[k],r)-1);

Result: TLE again. Screw this . Its not gonna happen and there is not much I can do as the other questions have to be coded with lot of care and I will definitely mess up as there is less than 20 mins to go . Some fb. Sudden thought : Why not try the vector idea? Nothing to lose.

Attempt-5:
Code code code. I implemented the vetor idea.

for (long i = 0; i < n; i += 1){
   ni(x);
   for (long j = 0; j &lt; cnt[x]; j += 1){
      v[factors[x][j]].pb(i);
   }
}
long l,r,k,l1,r1;
while(tc--){
   scanf("%lld %lld %lld",&l,&r,&k);
   l1=(upper_bound(v[k].begin(),v[k].end(),l-2) -v[k].begin());
   r1=(upper_bound(v[k].begin(),v[k].end(),r-1)-v[k].begin());

   printf("%lld\n",r1-l1);
}

Result: Wtf? Where is the submit button? Contest over! The question is added to practice. Submitted my solution and got a WA. I thought I messed up some where.

I opened the editorial and after a quick glance I saw the number 128. Then I realised my miscalculation-1 and then modified the declaration of factors to factors[100001][150] (from factors[100001][100]) and the result is an AC.

Counting the number of set bits in an integer

Most natural ways to do this is to loop through all bits in the integer increment the count for each set bit.

int NumberOfSetBits(int n){
  int count = 0;
  while(n){
    count += n & 1;
    n >>= 1;
  }
  return count;
}

But we can do better. The following code works perfectly fine for counting the number of bits set.

//for 64 bit numbers
int NumberOfSetBits64(long long i)
{
    i = i - ((i >> 1) & 0x5555555555555555);
    i = (i & 0x3333333333333333) +
        ((i >> 2) & 0x3333333333333333);
    i = ((i + (i >> 4)) & 0x0F0F0F0F0F0F0F0F);
    return (i*(0x0101010101010101))>>56;
}
//for 32 bit integers
int NumberOfSetBits32(int i)
{
    i = i - ((i >> 1) & 0x55555555);
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
    i = ((i + (i >> 4)) & 0x0F0F0F0F);
    return (i*(0x01010101))>>24;
}

The magic numbers might seem very confusing. But they are not random (obviously!).
Let us start with 0X55555555. Binary representation of 5 is 0101. So 0X55555555 has 16 ones, 16 zeros and the ones,zeros take alternate positions. Similarly 0X33333333 has 16 ones, 16 zeros and 2 consecutive ones, 2 consecutive zeros alternate.

F = 0x55555555 = 01010101010101010101010101010101
T = 0x33333333 = 00110011001100110011001100110011
O = 0x0f0f0f0f = 00001111000011110000111100001111

Now split the number into 16 sets of 2 bits
Ex: 9 = 1001 = 00,00,…..00,10,01
First step that algorithm does is to count the number of set bits in each of these 2-bit numbers. As each 2-bit number can have a maximum of 2 set bits. Each of these counts fit into a 2-bit number. That is what the below snippet does.

i = (i & 0x55555555) + ((i >> 1) & 0x55555555);

‘i & F’ preserves only the odd positioned ones. (i>>1) & F preserves only the even positioned ones and gets the even ones to odd positions. Adding them results in each pair of 2-bits having the number of set bits in their corresponding positions.
Similarly the after next 2 steps we end up having a number which can be seen as 4 8-bit numbers whose sum is the required count.

i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
i = (i & 0x0F0F0F0F) + (i >> 4)) & 0x0F0F0F0F);

Multiplying this number with 0X01010101 will result in a number whose most significant 8 bits contain the required count. Finally on summing up all these steps we get the following code.

int NumberOfSetBits32(int i)
{
    i = (i & 0x55555555) + ((i >> 1) & 0x55555555);
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
    i = (i & 0x0F0F0F0F) + (i >> 4)) & 0x0F0F0F0F);
    return (i*(0x01010101))>>24;
}

Here some more micro optimizations are possible. After giving some thought it can be understood that the 2 lines of code in the below snippet produce the same result.

i = (i & 0x55555555) + ((i >> 1) & 0x55555555);
i = i - ((i >> 1) & 0x55555555);

In general ‘&’ operator is not distributive over ‘+’. But in this case it does. In i and i>>4, each of the eight 4-bit numbers has a maximum value of 4 (fits in 3 bits). And adding these numbers will not cause an overflow in any of the 4-bit slots. So it works in this case. (Think of why it will not work int the previous steps).

i = (i & 0x0F0F0F0F) + (i >> 4)) & 0x0F0F0F0F);
i = (i & + (i >> 4)) & 0x0F0F0F0F;

So finally we arrive at the code for NumberOfSetBits32 as in the 2nd snippet ( NumberOfSetBits64 can be implemented in similar lines).

Resolutions 2013

No Shit: facebook, quora
Last year I got completely addicted to  facebook and quora.  My hands couldn’t be still on seeing status updates of friends in my close circle. I could not resist myself from not sharing(on facebook) anything I find interesting. This is good upto a certain extent , but everything has a limit and I crossed it. Finally I became the guy who always has fb open and scrolls down and down till it takes practically infinite time to load the feed.

Quora says it connects me to everything I want to know about. That is true. But it connects to many other things I don’t need at all. If not facebook in the second half of 2012 it was quora I was browsing all the time. The bad thing about quora is that , it gives me a false feeling that I’m doing some thing useful and that it is not a waste of time. I spent almost 4 hrs/day on quora in week days and even more in weekends.

Twitter is an exception and remains so. I was never addicted to twitter. One good thing about twitter is most of my friends are not active on twitter and nothing inspires me to write a comment (reply). I’m nothing more than a spectator. It is the same with quora as well. But the answers/posts are generally long. It eats a lot of time.

So ‘no shit’ boils down to no posts on facebook, not more than a couple of hours per week on quora. So from now on fb is just a platform for chatting and poking.

Github/BitBucket
Why do I need github/BitBucket

  • I always backed up code on my external hard disk. I never felt the need for an online backup. But after losing my hard disk , everything is gone. Suddenly I realized the need for an online backup.
  • Every company hosts their code on some content tracker. I never cared to know about them properly. Getting familiar with them might avoid some pains  in my professional life.
  • I was such a fool to not know that git is not just about github and gitbub one of the many online services that host git repositories.

From this year all not-shitty, non-proprietary code I write will be on github/bitbcket.

Coursera/Udacity
How should I spend the time I gain by giving up facebook and quora? Coursera/udacity is the answer.  Coursera has some very  nice and interesting courses taught by professors from top universities.

Here is my coursera wish list . The target is not to complete all these courses.  It is difficult to put in continuous,  consistent efforts for them (coursera expects that). Also there is no point in solving assignments on topics with which I’m super familiar. Learning is the only target.

Lose Weight
This needs no explanations!

Python for course assignments

Python is a beautiful programming language . The code looks very neat and verbal. To give it a try I chose to do my course assignments in python. Here are the advantages of using python over other languages for the course assignments.

  1. Not many students use python for assignments. This is a big advantage because if you get stuck in the assignments , you can borrow your friend’s code (c++/java) and blindly covert it line by line to python. You don’t have to worry about instructor’s warning on copying course assignments.
  2. The code looks very clean. It forces code indentation which is a good programming practice.
  3. Every damn thing is already implemented in the language and the documentation is pretty good.
  4. Java libraries can also be used with python via jython.
  5. Python might be very slow compared to C++/java. But speed is never an issue for course assignments. So it should not be a problem!

Happy python’ing !