CS 237 - Probability in Computing

Fall, 2015

Instructor and Lecture

Teaching Fellow and Lab Sections

Instructor: Wayne Snyder
      Email: waysnyder at gmail dot com
      Office: MCS 147
      Office Hours: T 1:30-2:30, W 2:30-4:30, R 12:30-2:00

      Cell Phone: 617-966-(10^2+41) (email vastly preferred, but if you call, leave a message!)

Teaching Fellow: Hannah Flynn
      Email: hmflynn at bu dot edu

     Office: PSY 221
      Office Hours: T 2-3 in CS Ugrad Lab (above Radio Shack)

Discussion Sections (all on Mondays)
      1-2pm in MCS B21
      A3: 2-3pm in PSY B49
      A4: 3-4pm in MCS B25


  • Solution to midterm is posted below and here: PDF
  • Here is a brief summary of the problem solving techniques we studied in the first part of the course: HTML
  • Here is a brief summary of the important discrete distributions we have studied: HTML
  • Here are some midterm statistics:

Here is a scatterplot of the joint random variable (X,Y), where X = midterm grade and Y = time to complete the test, with the marginal distributions; not much correlation, either positive or negative!


Schedule (each entry lists: Lecture & Lab Topics; Readings, from DeGroot unless otherwise noted; Homeworks and Tests)
1 R 9/3 Administrative matters; Goals of the course; Motivating Examples: Why should a computer scientist know probability and statistics?

Recommended: Chapter One from "The Drunkard's Walk": PDF

Here is a link to an article on traffic deaths after 9/11, with slightly different numbers than I presented: HTML

Here is the wiki page explaining the "Base Rate Fallacy" (the breathalyzer example at the end of class): HTML

Just for Fun: Here is a short YouTube video with various clips from movies which involve probability (we will return to the Monty Hall Problem, the first clip, next week): HTML

2 T 9/8 Classical Probability: Basic definitions; Set theory (e.g., Venn Diagrams) as a model for Sample Spaces and Events; Geometric Probability (uncountable sample spaces). Sections 1.1 - 1.4    
3 R 9/10 Axioms and theorems of probability; Geometric Probability (uncountable sample spaces);

Section 1.5 -- We will cover this in detail; pay particular attention to the last two pages, on uncountable sample spaces!

Section 1.6 -- We basically covered this in class today, but review it!


Solution: HTML

  M 9/14 Lab 01: Generating random numbers and running simulations    

Lab 01: HTML

Solution: TXT
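The theme of Lab 01, estimating a probability by running many random trials, can be sketched as follows; the function name and the particular experiment (at least one six in four die rolls, the classic de Méré problem) are illustrative choices, not the actual lab assignment:

```python
import random

def estimate_prob(trials=100_000, seed=0):
    """Estimate P(at least one six in four die rolls) by simulation."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(
        any(rng.randint(1, 6) == 6 for _ in range(4))
        for _ in range(trials)
    )
    return hits / trials

# The exact answer is 1 - (5/6)^4 ≈ 0.5177; the estimate should be close.
print(estimate_prob())
```

The fixed seed makes the simulation repeatable, which is useful when debugging a lab submission.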

4 T 9/15

The general Inclusion/Exclusion Principle; Combinatorics and counting finite sets; Multiplication principle and tree diagrams; The Monty Hall Problem (application of the Multiplication Principle); Choosing with and without replacement; Permutations.

Read this short article on the Inclusion-Exclusion Principle: HTML, and then read page 48 (skipping the proof if you like) in the textbook.

Sections 1.7 (main reading)

Also Google "probability tree diagram" and read the first link (easy) and then read sections 14.1 and 14.2 of the following analysis of the "Monty Hall Problem": PDF

5 R 9/17

Counting principles continued; permutations and combinations; accounting for duplicates; applications to classical probability.

Section 1.8


Solution: HTML

  M 9/21 Lab 02: Computing with permutations and combinations; simulation continued. Read about Pascal's Triangle and its relationship to C(N,K) here.  

Lab 02: HTML

Solution: TXT
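The connection between Pascal's Triangle and C(N,K) mentioned for Lab 02 can be sketched in a few lines; `pascal_row` is an illustrative helper name, not from the lab:

```python
from math import comb

def pascal_row(n):
    """Row n of Pascal's triangle, built left to right using the
    multiplicative recurrence C(n, k+1) = C(n, k) * (n-k) / (k+1)."""
    row = [1]
    for k in range(n):
        row.append(row[-1] * (n - k) // (k + 1))
    return row

# Each entry of row n equals the binomial coefficient C(n, k).
assert pascal_row(5) == [comb(5, k) for k in range(6)]  # [1, 5, 10, 10, 5, 1]
```

Integer division is exact here because each intermediate product is divisible by k+1, so no floating-point error creeps in.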

6 T 9/22 Counting concluded: combinations and subsets; accounting for repetitions; multinomial coefficients;

Section 1.8,

Read this link on Permutations with Repetitions;

Section 1.9

7 R 9/24 Discrete non-equiprobable sample spaces; Histograms vs. distributions; Conditional Probability

Read about Histograms here:

Section 2.1



Solution: HTML

  M 9/28 Lab 03: Drawing probabilities in Python: PMFs and CDFs    

Lab 03: HTML

Solution: PDF
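The PMF/CDF idea from Lab 03 can be sketched with the sum of two dice (an example of my choosing, not the lab itself): enumerate the sample space to get the PMF, then take a running sum to get the CDF.

```python
from fractions import Fraction
from collections import Counter

# PMF of the sum of two fair dice, built by enumerating all 36 outcomes.
counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))
pmf = {s: Fraction(c, 36) for s, c in sorted(counts.items())}

# The CDF is the running sum of the PMF.
cdf, total = {}, Fraction(0)
for s, p in pmf.items():
    total += p
    cdf[s] = total

print(pmf[7])   # 1/6, the most likely sum
print(cdf[12])  # 1, since 12 is the largest possible sum
```

Using `Fraction` keeps the probabilities exact; in the lab you would plot these values instead.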

8 T 9/29 Conditional Probability; Independence Section 2.2 Quiz 01 at end of class;  
9 R 10/1 Bayes' Theorem; Discrete Random Variables; Distributions, PMFs and CDFs

Read about Bayes' Theorem here: HTML

Just read the first two screenfuls or so, with the background, statement of the theorem and the example about Addison.

Read chapter 3.1 up to page 98 on Discrete Random Variables. This is just a way of formalizing what we have already been doing!


Solution: HTML

  M 10/5 Lab 04: Introduction to Pandas and data analysis    

Lab 04: HTML

Solution: TXT
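A minimal sketch of the kind of data analysis Lab 04 introduces; the dataset and column names here are made up for illustration:

```python
import pandas as pd

# A tiny, hypothetical dataset of quiz scores.
df = pd.DataFrame({
    "student": ["a", "b", "c", "d"],
    "quiz1": [8, 6, 9, 7],
    "quiz2": [7, 9, 10, 6],
})

# Derived column and a summary statistic, the usual first steps.
df["average"] = df[["quiz1", "quiz2"]].mean(axis=1)
print(df["average"].mean())  # 7.75
```

`mean(axis=1)` averages across the row (each student's quizzes), while the outer `.mean()` averages down the new column.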

10 T 10/6 Multiple Random Variables on the same sample space; Properties of random variables: mode, expected value/mean

Our textbook spreads the various characteristics of random variables across several sections, so I would rather you read Chapter Five from the Schaum's Outline, which I provide since not all of you will have bought it: PDF

This reading uses f(x) instead of p(x) for the PMF, but otherwise it is fairly consistent with what we have been doing.

11 R 10/8

Properties of random variables: moments; variance and standard deviation, skewness, kurtosis.

Uniform Distributions, Bernoulli Trials, Binomial Distribution

Finish reading Schaum's on properties of random variables.

Read pp. 97-99 in DeGroot, or look at the StatTrek page on Binomial: HTML


Solution: HTML

  T 10/13 Monday Schedule; Lab 05    

Lab 05: HTML

Solution: PDF

12 R 10/15

Characteristics of the Binomial: Mean, Mode, Variance, Standard Deviation;

Binomial Experiments; Why is the binomial so important?

Generalizations of the Binomial: Multinomial, Hypergeometric Distributions

Read about Multinomial and Hypergeometric Distributions in sections 5.9 and 5.3 of the textbook; just read to get the basic idea (these are generalizations of the binomial)


OR, read pages 193-4 of the Schaum's Outline (easier) here: PDF


Solution: HTML

  M 10/19 Lab 06: Fitting distributions to data; Generating random variates by simulation.    

Lab 06: HTML

Solution: PDF
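Lab 06's idea of generating random variates by simulation can be sketched for the binomial: a Binomial(n, p) variate is just a count of Bernoulli successes. The function name and parameters below are illustrative, not the lab's:

```python
import random

def binomial_variate(n, p, rng):
    """Simulate X ~ Binomial(n, p) as a count of n Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n))

rng = random.Random(1)
sample = [binomial_variate(20, 0.3, rng) for _ in range(10_000)]
mean = sum(sample) / len(sample)
# The sample mean should be near the theoretical mean n*p = 6.
print(mean)
```

Comparing the sample mean (and, in the lab, the whole histogram) against the theoretical distribution is the basic check that a simulator is correct.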

13 T 10/20 Other discrete distributions: Poisson

Read about Poisson in Section 5.4 of the textbook; this whole section is very good, but you should skip all the proofs, and skip Theorems 5.4.3 and 5.4.4.

The Wiki page on the Poisson is also good: HTML, read up through the Properties section for mean, variance, & mode.

Quiz 02 Solution: PDF  
14 R 10/22

Conclusions on discrete distributions: Poisson concluded; the Geometric distribution; the Negative Binomial distribution.

Continuous random variables: Uniform distribution, why CDFs are important; integrals in place of summations

Read about the Negative Binomial and the Geometric in 5.5; we will use the formula at the top of p.298 for the PMF; skip Theorems 5.5.2 and 5.5.3 but look carefully at Theorem 5.5.5 (the Memoryless Property of Geometric).

You can also look at these in the summary of distributions which I have linked at the top of the page.



Solution: HTML


  M 10/26 Lab 07: Generating random variates by the inverse method; approximating the Binomial with the Poisson    

Lab 07: HTML

Solution: PDF
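The inverse method from Lab 07 can be sketched for the exponential distribution, whose CDF F(x) = 1 - e^(-λx) inverts in closed form: if U ~ Uniform(0,1), then -ln(1-U)/λ ~ Exp(λ). The function name below is my own:

```python
import math
import random

def exponential_variate(lam, rng):
    """Inverse-CDF method: solve u = 1 - exp(-lam*x) for x."""
    u = rng.random()
    return -math.log(1 - u) / lam

rng = random.Random(0)
sample = [exponential_variate(2.0, rng) for _ in range(100_000)]
sample_mean = sum(sample) / len(sample)
# The mean of Exp(lam) is 1/lam = 0.5; the sample mean should be close.
print(sample_mean)
```

The same recipe works for any distribution whose CDF you can invert, which is exactly why CDFs matter for simulation.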

15 T 10/27

Continuous random variables: Uniform distribution, why CDFs are important; integrals in place of summations

Normal Distribution

Remind yourself about the properties of continuous distributions by looking through section 3.2.

The wiki page on continuous distributions is also useful: HTML

If you forget how integrals work, you might want to glance over the Wiki article: HTML

Read pp.302-303 on the motivation and definition of the Normal Distribution (Def 5.6.1), and look at Definition 5.6.2 on the Standard Normal Distribution.

The Wiki page on the normal distribution is excellent: HTML

PPT Lecture on Normal Distribution: PDF

16 R 10/29 Approximating the binomial with the Normal distribution; Exponential Distribution; relationship between Exponential and Poisson.

Read about the normal approximation to the binomial here: HTML

On Exponential: Read pp.321-4; also see the Wiki Page: HTML up through "Memoryless Property" and also read "Occurrences of Events"
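The normal approximation to the binomial can be checked numerically with a short sketch (the continuity correction of 0.5 is the standard trick; the helper names are mine):

```python
import math
from math import comb

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def normal_cdf(x, mu, sigma):
    """Phi((x - mu)/sigma), via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

n, p, k = 100, 0.5, 55
exact = binom_cdf(k, n, p)
# Continuity correction: approximate P(X <= 55) by Phi(55.5).
approx = normal_cdf(k + 0.5, n * p, math.sqrt(n * p * (1 - p)))
print(exact, approx)  # the two values agree to about three decimal places
```

Here mu = np = 50 and sigma = sqrt(np(1-p)) = 5, matching the parameters of the approximating normal.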


Solution: HTML

  M 11/2 No Lab!    

T 11/3 Conclusions on Distributions; Theoretical results about distributions: Chebyshev's Inequality; Law of Large Numbers; Central Limit Theorem



17 R 11/5 Midterm   Midterm Solution: PDF  
  M 11/9 No lab!    

18 T 11/10 Joint random variables; Independent RVs

Please look at the section "Discrete Joint Distributions" in 3.4 and then read 3.5 on Marginal Distributions. 

As usual, the Wiki article on this is very good:


19 R 11/12 Covariance, Correlation, Autocorrelation

Read section 4.6 in the textbook (again, skipping all the proofs, and understanding that we are only really considering the discrete case)

The Schaum's Outline has a brief treatment of these topics starting on page 129 here: PDF

And, again, the Wiki article is useful: HTML



Solution: HTML

  M 11/16 Lab 08: Computing Covariance and Correlation; Correlation in data sets    

Lab 08: HTML

Solution: TXT
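The computations at the heart of Lab 08 can be sketched directly from the definitions (population-style formulas, dividing by n; the function names are mine):

```python
def covariance(xs, ys):
    """Covariance: average product of deviations from the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    """Correlation: covariance normalized by the standard deviations."""
    sx = covariance(xs, xs) ** 0.5
    sy = covariance(ys, ys) ** 0.5
    return covariance(xs, ys) / (sx * sy)

# A perfectly linear relationship has correlation 1 (up to rounding).
print(correlation([1, 2, 3, 4], [2, 4, 6, 8]))
```

Note that covariance of a list with itself is just the variance, which is why `covariance(xs, xs) ** 0.5` gives the standard deviation.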

20 T 11/17 Regression and Curve Fitting

The Schaum's treatment of these topics in Appendix A6, starting on p.257, is fairly good: PDF

The Wiki article on "Simple Linear Regression" is also good: HTML


21 R 11/19 Guest Lecture by Steve Homer: Probabilistic Algorithms  


  M 11/23 Lab 09: Displaying Joint Distributions    

Lab 09: HTML

Solution: TXT
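The joint-distribution table that Lab 09 displays can be sketched with a small example of my choosing: X = the first of two dice, Y = the max of both. The joint PMF is a table over (x, y) pairs, and a marginal is a row or column sum.

```python
from fractions import Fraction
from collections import defaultdict

# Joint PMF of (X, Y): X = first die, Y = max of the two dice.
joint = defaultdict(Fraction)
for a in range(1, 7):
    for b in range(1, 7):
        joint[(a, max(a, b))] += Fraction(1, 36)

# Marginal of Y: sum the joint PMF over all values of x.
marginal_y = defaultdict(Fraction)
for (x, y), p in joint.items():
    marginal_y[y] += p

print(marginal_y[6])  # 11/36, since P(max = 6) = 11/36
```

In the lab the `joint` table would be rendered as a heatmap or 2D histogram rather than printed.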

22 T 11/24 Statistics: Sampling Theory and overview      
R 11/26 No Class: Thanksgiving Break


  M 11/30 Lab 10: Topic TBD    

Lab 10: HTML

Solution: TXT

23 T 12/1 Statistics: Statistical Inference      
24 R 12/3 Statistics: Statistical Inference  


Solution: HTML

  M 12/7 Lab 11: Topic TBD    

Final Project: HTML

25 T 12/8 Statistics: Statistical Inference      
26 R 12/10 Statistics: Statistical Inference; Wrapup and comments on final exam  


    Final Project due      
  R 12/17 Final Exam 12:30 - 2:30pm