Summary of Discrete Distributions for CS 237

These notes are intended to supplement the textbook and lectures in CS 237 to help you analyze and solve problems in the second part of the course (on Random Variables). We are still developing a "toolbox" of techniques for solving classic probability problems, but now the focus is wider: we look at the overall outcomes of a random experiment and formalize them in terms of functions which return real numbers (random variables). To this point, we have only considered discrete distributions, i.e., those where the range of X is finite or countably infinite.

For each of these distributions, I will provide a brief motivation, give a canonical problem which this distribution solves, describe the random variable and the formula for the PMF, list the statistical properties, and then give illustrations of the distribution for insight.

Since the standard deviation is simply the square root of the variance, it will not be listed separately.

 

Uniform Distribution: U(a,b)

Motivation: The uniform distribution describes the equiprobable case where X takes on sequential integer values from a to b inclusive.

Definition:  X ~ U(a,b) when

        Rng(X) = { a, ..., b }
        P(X=k) = 1/(b-a+1)

Canonical Problem: Throw a single die. What is the probability that k shows, for 1 ≤ k ≤ 6?

Example: Throw a single die; probability that a 4 shows is 1/6.

             S = { 1, 2, 3, 4, 5, 6 }
             P = { 1/6, ..., 1/6 }          

Statistics:

        E(X) = (a+b) / 2
        Var(X) = [(b-a+1)^2 - 1] / 12
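
For concreteness, here is a small Python sketch (the helper name uniform_pmf is just illustrative) that evaluates this PMF and these statistics for the die example U(1,6):

        # Discrete uniform U(a,b): evaluate the PMF and the statistics above
        def uniform_pmf(k, a, b):
            return 1 / (b - a + 1) if a <= k <= b else 0.0

        a, b = 1, 6                              # a single die
        print(uniform_pmf(4, a, b))              # 0.1666... = 1/6
        print((a + b) / 2)                       # E(X) = 3.5
        print(((b - a + 1)**2 - 1) / 12)         # Var(X) = 2.9166...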
              

Illustrations:

 

Bernoulli Distribution: Bernoulli(p)

Motivation: The Bernoulli distribution describes a single trial with two possible outcomes, success or failure, where p = probability of success.

Definition: If 1 = success and 0 = failure, then X ~ Bernoulli(p) when

        Rng(X) = { 0, 1 }
        P(X=0) = 1 - p
        P(X=1) = p   

Canonical Problem: Flip a (possibly unfair) coin, where the probability of a head is p. Observe whether a head appears.

Example: Toss a fair coin; the probability of heads is 1/2.

             S = { T, H }       
             P = { 1/2, 1/2 }          

Statistics:

        E(X) = p
        Var(X) = p(1-p)
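
In code, a Bernoulli(p) variable is just a weighted coin flip; a minimal Python sketch (the helper name bernoulli_trial is illustrative, and random.random is the standard library's uniform sampler):

        import random

        def bernoulli_trial(p):
            # returns 1 (success) with probability p, else 0 (failure)
            return 1 if random.random() < p else 0

        p = 0.5
        samples = [bernoulli_trial(p) for _ in range(100_000)]
        print(sum(samples) / len(samples))       # close to E(X) = p = 0.5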

Illustrations:

 

Binomial Distribution: B(N,p)

Motivation: The binomial distribution describes the number of successes which occur among N independent Bernoulli trials.

Definition:  X ~ B(N,p) when

        Rng(X) = { 0, ..., N }
        P(X=k) = C(N,k) * p^k * (1-p)^(N-k) 

Canonical Problem: Flip a (possibly unfair) coin N times. What is the probability that exactly k heads appear?

Example: Flip a fair coin 5 times and count the number of heads; the probability of exactly 2 heads is 10/32.

             S = { 0, 1, 2, 3, 4, 5 } 
             P = { 1/32, 5/32, 10/32, 10/32, 5/32, 1/32 }

Statistics:

        E(X) = N*p
        Modes(X) = floor( (N+1)p - 1 ),  floor( (N+1)p )         // if  (N+1)p is not an integer, then only the right mode exists 
        Var(X) = N*(1-p)*p
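
A short Python check of these formulas for the example above (N = 5, p = 1/2); math.comb is the standard-library binomial coefficient C(N,k):

        from math import comb

        def binomial_pmf(k, N, p):
            return comb(N, k) * p**k * (1 - p)**(N - k)

        N, p = 5, 0.5
        print(binomial_pmf(2, N, p))             # 0.3125 = 10/32
        print(N * p)                             # E(X) = 2.5
        print(N * p * (1 - p))                   # Var(X) = 1.25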
              

Illustrations:

Poisson Distribution: Poi(λ)

Motivation: If we have a process in which events arrive (hence, called arrivals) independently through time, with some mean rate λ = # arrivals/unit time, then the Poisson characterizes how many arrivals will occur in a randomly chosen time unit.

Definition: X ~ Poi(λ) if

        Rng(X) = { 0, 1, ...  }        // countably infinite
        P(X=k) = λ^k * e^-λ / k!

where e = 2.71828183.... (Euler's number). 

Canonical Example: If emails arrive in my inbox on average 5 times an hour, what is the probability that exactly 4 emails will arrive in the next hour?

        time unit = 1 hour, λ = 5, k = 4
  
        Solution: (5)^4 * e^-5 / 4!  = 0.1755. 

Statistics:

        E(X) = λ
        Modes(X) = floor(λ)-1, floor(λ)        // if λ is not an integer, then only the right mode exists
        Var(X) = λ
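
A quick Python check of the PMF against the email example (λ = 5, k = 4); the helper name poisson_pmf is just illustrative:

        from math import exp, factorial

        def poisson_pmf(k, lam):
            return lam**k * exp(-lam) / factorial(k)

        print(poisson_pmf(4, 5))                 # 0.17546..., the email example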

                           

Illustrations: (using different values for λ; range is infinite, so I show just up to some limit)

 

Geometric Distribution: G(p)

Motivation: This counts the number of Bernoulli trials until the first success occurs; more generally, it describes discrete quantities that decrease exponentially over time or space.

Definition: X ~ G(p) if

        Rng(X) = { 1, 2, ... }
        P(X=k) = (1-p)^(k-1)*p                //  Pattern  FFF....FS   where there are k total trials
                  

Canonical Example: When flipping a fair coin, what is the probability that it takes exactly 4 flips to get the first head?

        Solution: (1/2)^3 * (1/2)  = 1/16

Statistics:

        E(X) = 1/p
        Mode(X) = 1
        Var(X) = (1-p)/p^2
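
A small Python check of the PMF and statistics for a fair coin (p = 1/2); the helper name geometric_pmf is illustrative:

        def geometric_pmf(k, p):
            # probability that the first success occurs on trial k (k >= 1)
            return (1 - p)**(k - 1) * p

        p = 0.5
        print(geometric_pmf(4, p))               # 0.0625 = 1/16
        print(1 / p)                             # E(X) = 2.0
        print((1 - p) / p**2)                    # Var(X) = 2.0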

                           

Illustrations: (range is infinite, so I just show up to some limit)

 

Multinomial Distribution: M(n,X)

Motivation: The multinomial distribution generalizes the binomial to trials in which there are more than two outcomes.

Definition: Suppose we have a finite discrete random variable X with Rng(X) = { s1, ..., sk } and probability measure p = { p1, ..., pk }, i.e., such that P(X=si) = pi. Fix some number of trials n, and define a vector of random variables [ X1, ..., Xk ] where Xi = the number xi of occurrences of si in the n trials:

        Rng( X1, X2, ..., Xk ) = { [ x1, ..., xk ]  | all xi ≥ 0 and x1 + ... + xk  = n }

        P(X1=x1, X2=x2, ..., Xk=xk) = n! / (x1! * x2! * ... * xk!) * p1^x1 * p2^x2 * ... * pk^xk

Canonical Problem: Throw a die n times. What is the probability that the vector of occurrences of each of the numbers 1 ... 6 is [ x1, ..., x6 ], where x1 + ... + x6 = n?

Example: Throw a die 8 times; what is the probability of one each of 1 .. 4, two 5's, and two 6's?

        8!/(2!*2!) * (1/6)^1 * (1/6)^1 * (1/6)^1 * (1/6)^1 * (1/6)^2 * (1/6)^2 
      = 8!/(2!*2!) * (1/6)^8
      = 0.0060
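
A Python sketch of this PMF, checked against the die example (n = 8, counts [1,1,1,1,2,2]); the helper name multinomial_pmf is just illustrative:

        from math import factorial

        def multinomial_pmf(counts, probs):
            n = sum(counts)
            coef = factorial(n)
            for x in counts:
                coef //= factorial(x)            # multinomial coefficient n!/(x1!*...*xk!)
            prob = 1.0
            for x, p in zip(counts, probs):
                prob *= p**x
            return coef * prob

        counts = [1, 1, 1, 1, 2, 2]              # one each of 1..4, two 5's, two 6's
        probs  = [1/6] * 6
        print(multinomial_pmf(counts, probs))    # 0.00600...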

Statistics:

        E(Xi) = n*pi
        Var(Xi) = n*(1-pi)*pi            

[To illustrate would require higher-dimensional graphs, so we will omit!]

Hypergeometric Distribution: H(N,K,n)

Motivation: The binomial describes selection with replacement; the hypergeometric distribution is the analogous distribution for selection without replacement.

Definition: Suppose we have N balls in an urn, of which K are red and N-K are black. Draw n ≤ N balls from the urn without replacement.

Then X = the number of red balls drawn from the urn in n trials. If k red balls are drawn, then clearly n-k black balls are drawn; thus:

        Rng(X) = { 0, 1, ..., min(K,n) }
        P(X=k) = C(K,k) * C(N-K,n-k) / C(N,n)

Canonical Example: From a standard deck of cards (with 13 Spades), draw a 5-card hand; what is the probability that exactly 3 of these are Spades?

        Solution: C(13,3) * C(39,2) / C(52,5) = 0.0815
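
In Python this computation (and the general PMF) looks like the following; the helper name hypergeometric_pmf is illustrative, and math.comb is C(n,k):

        from math import comb

        def hypergeometric_pmf(k, N, K, n):
            # k red balls among n draws, without replacement, from N balls of which K are red
            return comb(K, k) * comb(N - K, n - k) / comb(N, n)

        print(hypergeometric_pmf(3, 52, 13, 5))  # 0.08154..., 3 Spades in a 5-card hand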

Statistics:

        E(X) = n*K/N
        Mode(X) = floor( (n+1)*(K+1) / (N+2) )
        Var(X) = n * (K/N) * (1 - K/N) * (N-n)/(N-1)

                           

Illustrations: (starting with a set of 5 balls, 3 red and 2 black)

 

Negative Binomial: NB( r , p )

Motivation: The negative binomial generalizes the geometric to count the number of trials until r successes.

Definition: X ~ NB(r,p), where r = the number of successes to wait for and p = the probability of success on each trial:

        Rng(X) = { r, r+1, r+2, .... }                  
        P(X=k) = C(k-1,r-1) * (1-p)^(k-r) * p^r    //  Pattern:  ..F..S..FS   where S occurs r times
                  

Canonical Example: Suppose team A and team B play a series, which is over when one team has won 4 games. Team A has a 75% chance of winning any particular game. What is the probability that team A wins the series in exactly 6 games?

        r = 4, k = 6

        Solution: C(5,3) * (1/4)^2 * (3/4)^4 = 810/4096 = 0.1978.

Statistics:

        E(X) = r/p
        Var(X) = r(1-p)/p^2
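
A Python check of the PMF against the series example (r = 4, p = 3/4, k = 6); the helper name negative_binomial_pmf is just illustrative:

        from math import comb

        def negative_binomial_pmf(k, r, p):
            # probability that the r-th success occurs on trial k (k >= r)
            return comb(k - 1, r - 1) * (1 - p)**(k - r) * p**r

        print(negative_binomial_pmf(6, 4, 0.75)) # 0.19775..., team A wins in exactly 6 games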

                           

Illustrations: (range is infinite, so I'll just show up to some limit)