Here is my exchange with a student, which I think might be useful to others (the bottom paragraph is the most important):
> -----Original Message-----
> > > Hi Professor, Problem 11.2-1 seems deceptively simple, I think I may
> > > be reading wrong into it. My reasoning is that if given uniform hashing
> > > we have each element landing into specific spot. So if there are >= slots
> > > there shouldn't be any collision. Otherwise #of collisions is simply
> > > #elements - #slots. Am I missing something here, this one is
> > > bugging me. Thanks in advance.
> > Indeed, say, you have exactly the same numbers of slots and elements. Then
> > assuming you had no collisions up to the last element, out of n possible
> > outcomes of the hash function only 1 is non-collision. So the probability of
> > that is 1/n, and with probability 1-1/n you will have a collision.
>> (so, if n=100, the probability of collision would be at least 99%)
> > I think you are misunderstanding what uniform means: re-read it in the book,
> > but intuitively, it means that each element can be hashed into any slot with
> > the same probability (for each element, imagine rolling a large enough dice
> > to determine which slot it will get into).
> Load factor A would indicate how "overloaded" a slot is on the average.
> So, is A = 1.3 there's 1.3-1=.3 "extra" element in the slot-chain on the
> average. So if we have M slots than there are M*.3 collisions. Or
> (n/m - 1)*m=n-m
> Am I overlooking some cases?
I think you understand the (over)load correctly. But still, try to plug in
some simpler numbers - if you say that the load factor is 1, then your
argument would mean that there are no collisions, but in the previous email
I argued that still collisions are very likely even in this case. In fact,
the so called birthday paradox says that at least one collision is very
likely even if the size of the table m is m=n^2, i.e. the load factor is
So, what you need to do for this problem is to carefully write down the
probability of a collision for each pair of input values, and then look up
the expected value definition and try to write it for this case.