There is only one part to this problem set.
All problems are due by 11:59 p.m. on Sunday, April 13, 2025.
In your work on this assignment, make sure to abide by the collaboration policies of the course.
Don’t forget to use docstrings and to take whatever other steps are needed to create readable code.
If you have questions while working on this assignment, please
come to office hours, post them on Piazza, or email
cs111-staff@cs.bu.edu
.
Make sure to submit your work on Gradescope, following the procedures found at the end of the assignment.
Create a subfolder called ps8
within your
cs111
folder, and put all of the files for this assignment in that
folder.
Date
class50 points; individual-only
Some people have an extraordinary talent to compute (in their heads) the day of the week that any past date fell on. For example, if you tell them that you were born on October 21, 2000, they’ll be able to tell you that you were born on a Saturday!
In this problem, you will create a Date
class, from which you will
be able to create Date
objects that represent a day, month, and
year. You will add functionality to this class that will enable Date
objects to find the day of the week to which they correspond.
Getting started
Begin by downloading the following file:
ps8pr1.py
Make sure to put the file in your ps8
folder. If your browser
doesn’t allow you to specify where the file should be saved, try
right-clicking on the link above and choosing Save as... or
Save link as..., which should produce a dialog box that allows you
to choose the correct folder for the file.
Open that file in Spyder, and you’ll see that we’ve given you the following methods to start:
The __repr__(self)
method, which returns a string
representation of a Date
object. This method will be called when
an object of type Date
is printed. It can also be tested by
simply evaluating an object from the console. This method formats the
month, day, and year that represent a Date
object into a string
of the form 'mm/dd/yyyy'
and returns it.
The is_leap_year(self)
method, which returns True
if the called
object is in a leap year, and False
otherwise. In other
words, when we create a Date
object and call its is_leap_year
method, the method will return whether that specific Date
object
falls in a leap year.
The days_in_month(self)
method, which returns the number of days
in the called object’s month. In other words, when we create a
Date
object and call its days_in_month
method, the method will
return the number of days in that Date
‘s month.
The copy(self)
method, which returns a newly-constructed
object of type Date
with the same month, day, and year that the
called object has. This allows us to create deep copies of
Date
objects.
The day_name(self)
method, which returns a string indicating
day of the week on which the called Date
object falls:
'Monday'
, 'Tuesday'
, 'Wednesday'
, 'Thursday'
, 'Friday'
,
'Saturday'
, or 'Sunday'
.
Important: The provided day_name
method won’t work correctly
until you have implemented the methods described below.
Your tasks
Below you will add several new methods to the Date
class. Be sure
to thoroughly test your methods for all cases to ensure that they are
correct. In addition, make sure to include a docstring for each
method that you write.
Important
If you add test code to your ps8pr1.py
file, please put it in
one or more separate test functions, which you can then call to do
the testing. Having test functions is not required. However,
you should not have any test code in the global scope (i.e.,
outside of a function).
Implement the Date
constructor – i.e., the __init__
method. We have given you the header for this method, and you
should fill in the body so that it initializes the attributes of
the Date
object (month
, day
, and year
) to the values that
are passed in as parameters.
Don’t forget to use the keyword self
when specifying an
attribute. For example, when initializing the month
attribute,
you should write a statement that looks like this:
self.month = ...
Here is some code you can use to test your constructor:
>>> d1 = Date(4, 7, 2025) >>> d1.month result: 4 >>> d1.day result: 7 >>> d1.year result: 2025
Note that we don’t actually use the name __init__
to call the
method. Rather, we call it by using the name of the class (Date
).
Once you have implemented the constructor, read over the rest of the starter code that we’ve given you. Make sure that you understand how the various methods work.
Try the following interactions in the IPython console to experiment
with the __repr__
, is_leap_year
, and days_in_month
methods:
>>> d1 = Date(4, 7, 2025) # An example of using the __repr__ method. Note that no quotes # are displayed, even though the function returns a string. >>> d1 result: 04/07/2025 # Check if d1 is in a leap year. >>> d1.is_leap_year() result: False # Get the number of days in d1's month. >>> d1.days_in_month() result: 30 # Create another object named d2 >>> d2 = Date(12, 31, 2024) >>> d2 result: 12/31/2024 # Check if d2 is in a leap year. >>> d2.is_leap_year() result: True # Get the number of days in d2's month. >>> d2.days_in_month() result: 31
Next, try the following examples in the IPython console to illustrate
why we will need to override the __eq__
method to change the meaning
of the ==
operator:
>>> d1 = Date(1, 1, 2025) >>> d2 = d1 >>> d3 = d1.copy() # Determine the memory addresses to which the variables refer. >>> id(d1) result: 430542 # Your memory address may differ. >>> id(d2) result: 430542 # d2 is a reference to the same Date that d1 references. >>> id(d3) result: 413488 # d3 is a reference to a different Date in memory. # The == operator tests whether memory addresses are equal. >>> d1 == d2 result: True # d1 and d2 have the same memory address. >>> d1 == d3 result: False # d1 and d3 have different memory addresses.
Write a method advance_one(self)
that changes the
called object so that it represents one calendar day after the
date that it originally represented.
Notes:
This method must not return anything. Instead, it should change the value of one or more variables inside the called object.
Since we are advancing the Date
object by one day, self.day
will change. Depending on what the value of self.day
is,
self.month
and self.year
may also change.
You can use the provided days_in_month()
method to determine
how many days are in the Date
object’s month. Because you
are calling one method from inside another Date
method, you
should use the self
parameter to make the call:
self.days_in_month()
Examples:
>>> d = Date(12, 31, 2024) >>> d result: 12/31/2024 >>> d.advance_one() >>> d result: 01/01/2025 >>> d = Date(2, 28, 2024) >>> d result: 02/28/2024 >>> d.advance_one() >>> d result: 02/29/2024 >>> d.advance_one() >>> d result: 03/01/2024 >>> d.advance_one() >>> d result: 03/02/2024
Write a method __eq__(self, other)
that returns True
if the
called object (self
) and the argument (other
) represent the same
calendar date (i.e., if the have the same values for their day
,
month
, and year
attributes). Otherwise, this method should return
False
.
Recall from lecture that the name __eq__
is a special method
name that allows us to override the ==
operator–replacing the
default version of the operator with our own version. In other
words, when the ==
operator is used with Date
objects, our new
__eq__
method will be invoked!
This method will allow us to use the ==
operator to see if two
Date
objects actually represent the same date by testing whether
their days, months, and years are the same, instead of testing
whether their memory addresses are the same.
After implementing your __eq__
method, try re-executing the
following sequence of statements from Task 0:
>>> d1 = Date(1, 1, 2025) >>> d2 = d1 >>> d3 = d1.copy() # Determine the memory addresses to which the variables refer. >>> id(d1) result: 430542 # Your memory address may differ. >>> id(d2) result: 430542 # d2 is a reference to the same Date that d1 references. >>> id(d3) result: 413488 # d3 is a reference to a different Date in memory. # The new == operator tests whether the fields have the same values. >>> d1 == d2 result: True # Both refer to the same object, so their fields # also have the same values >>> d1 == d3 result: True # These variables refer to different objects, but # their fields have the same values.
Notice that we now get True
when we evaluate d1 == d3
. That’s
because the new __eq__
method compares the internals of the objects
to which d1
and d3
refer, rather than comparing the memory
addresses of the objects.
Write a method is_before(self, other)
that returns True
if
the called object represents a calendar date that occurs before the
calendar date that is represented by other
. If self
and other
represent the same day, or if self
occurs after other
, the method
should return False
.
Note: This method is similar to the __eq__
method that you have
written in that you will need to compare the years, months, and
days to determine whether the called object (self
) comes before
other
.
Examples:
>>> ny = Date(1, 1, 2025) >>> d1 = Date(11, 17, 2024) >>> d2 = Date(4, 7, 2025) >>> tg = Date(11, 28, 2024) >>> ny.is_before(d1) result: False >>> d1.is_before(ny) result: True >>> d1.is_before(d2) result: True >>> d2.is_before(d1) result: False >>> d1.is_before(tg) # same month and year result: True >>> tg.is_before(d1) result: False >>> tg.is_before(tg) # a Date object with itself result: False >>> ny.is_before(d2) # same year, but different months result: True >>> d2.is_before(ny) result: False
Write a method is_after(self, other)
that returns True
if
the called object (self
) represents a calendar date that occurs
after the calendar date that is represented by other
. If self
and other
represent the same day, or if self
occurs before
other
, the method should return False
.
Note: There are two ways of writing this method. You can either
emulate your code for is_before
OR you can think about how
you could call __eq__
(==
) and is_before
to make writing
this method very simple!
Examples:
>>> ny = Date(1, 1, 2025) >>> d1 = Date(11, 17, 2024) >>> d2 = Date(4, 7, 2025) >>> tg = Date(11, 28, 2024) >>> ny.is_after(d1) result: True >>> d1.is_after(ny) result: False >>> d1.is_after(d2) result: False >>> d2.is_after(d1) result: True >>> d1.is_after(tg) # same month and year result: False >>> tg.is_after(d1) result: True >>> tg.is_after(tg) # a Date object with itself result: False >>> ny.is_after(d2) # same year, but different months result: False >>> d2.is_after(ny) result: True
Write a method days_between(self, other)
that returns an integer
that represents the number of days between self
and other
.
Notes:
This method should not change self
or other
during its
execution.
The sign of the return value is important! In particular:
self
and other
represent the same calendar date, this
method must return 0.self
is before other
, this method must return a
negative integer equal to the number of days between the two
dates.self
is after other
, this method must return a
positive integer equal to the number of days between the two
dates.Suggested Approach:
Since this method should not change the original objects, you
should first use the copy
method to create deep copies of
self
and other
.
Then, use is_before
or is_after
to figure out which date
comes first.
You should use the advance_one
method that you have already
written to advance the copy of the earlier of the two dates, and
you should continue advancing that date until you obtain the correct
count of the number of days between the two dates.
Here is some pseudocode for one way of computing the correct
count, assuming that date1
is the copy of the earlier date
and date2
is the copy of the later date:
initialize your counter while date1 != date2: increment your counter advance date1 by one day using advance_one
Feel free to transform this pseudocode into Python code as part of your implementation of this method.
As shown in the pseudocode above, you may find it helpful to
use the !=
operator to test if two Date
objects are not
equal to each other. When used with Date
objects, this
operator will call your __eq__
method to compare the
internals of the objects, and it will return the opposite of
what __eq__
returns.
Once you know how many days separate the two values, you can
again use is_before
or is_after
to figure out whether the
returned result should be positive or negative.
You should not try to subtract years, months, and days between the two dates. This technique is too prone to mistakes.
Examples:
>>> d1 = Date(4, 13, 2025) >>> d2 = Date(5, 9, 2025) # our final exam! >>> d2.days_between(d1) result: 26 >>> d1.days_between(d2) result: -26 >>> d1 # Make sure the original objects did not change. 04/13/2025 >>> d2 05/09/2025 # Here are two that pass over a leap day. >>> d3 = Date(12, 1, 2023) >>> d4 = Date(3, 15, 2024) >>> d4.days_between(d3) result: 105
Once all of the above methods are correctly implemented, it should be
possible to use the provided day_name
method to determine the
day of week of any valid Date
object.
For example:
>>> d = Date(4, 13, 2025) >>> d.day_name() result: 'Sunday' >>> Date(4, 15, 2025).day_name() result: 'Tuesday' >>> Date(1, 1, 2100).day_name() result: 'Friday' >>> Date(7, 4, 1776).day_name() result: 'Thursday'
If one or more of these examples do not return the expected result, that means that you have a bug in one or more of the methods that you wrote. You should test and debug as needed:
days_between
advance_one
is_before
and is_after
__eq__
, since it will be called implicitly when your
days_between
method uses the ==
or !=
operator to compare two Date
objects.25 points; pair-optional
This is the only problem of the assignment that you may complete with a partner. See the rules for working with a partner on pair-optional problems for details about how this type of collaboration must be structured.
In this problem, you will write a program that is capable of generating meaningful text all by itself! You will accomplish this by implementing what is known as a Markov text-generation algorithm.
Background
English is a language with a lot of structure. Words have a tendency
(indeed, an obligation) to appear only in certain
sequences. Grammatical rules specify legal combinations of different
parts of speech. For example, the phrase “The cat climbs the stairs”
obeys a legal word sequence. “Stairs the the climbs cat” does
not. Additionally, semantics (the meaning of a word or sentence)
further limits possible word combinations. “The stairs climbs the
cat” is a perfectly legal sentence, but it doesn’t make much sense and
you are very unlikely to encounter this word ordering in practice.
Even without knowing the formal rules of English or the meaning of English words, we can get an idea of which word combinations are legal simply by looking at well-formed English text and noting the combinations of words that tend to occur in practice. Then, based on our observations, we can generate new sentences by randomly selecting words according to commonly occurring sequences of these words. For example, consider the following text:
I love roses and carnations. I hope I get roses for my birthday.
If we start by selecting the word “I”, we notice that “I” may be followed by “love,” “hope,” and “get” with equal probability in this text. We randomly select one of these words to add to our sentence: “I get.” We can repeat this process with the word “get,” necessarily selecting the word “roses” as the next word. Continuing this process could yield the phrase:
I get roses and carnations.
Note that this is a valid English sentence, but not one that we have seen before. Other novel sentences we might have generated include “I love roses for my birthday.” and “I get roses for my birthday.”
More formally, the process used to generate these sentences is called
a first-order Markov process. A first-order Markov process is a
process in which the state at time t + 1
(i.e., the next word)
depends only on the state at time t
(i.e., the previous word). In a
second-order Markov process, the next word would depend on the two
previous words, and so on. Our example above was a first-order process
because the choice of the next word depended only on the current word.
Your tasks
Implementing a first-order Markov text generator will involve
writing two functions: one to process a file and create a dictionary of
legal word transitions, and another to actually generate the new text.
We will consider words to be different even if they only differ
by capitalization or punctuation. For example, 'spam'
, 'Spam'
, and
'spam!'
will all be considered distinct words.
To start, open a new file in Spyder and save it to your ps8
folder
using the name ps8pr2.py
.
Put all of the code that you write for this problem in that file. Don’t forget to include appropriate comments at the top of the file, and a docstring for each function.
Important
If you add test code to your ps8pr2.py
file, please put it in
one or more separate test functions, which you can then call to do
the testing. Having test functions is not required. However,
you should not have any test code in the global scope (i.e.,
outside of a function).
Write a function create_dictionary(filename)
that takes a
string representing the name of a text file, and that returns a
dictionary of key-value pairs in which:
each key is a word encountered in the text file
the corresponding value is a list of words that follow the key word in the text file.
For example, the dictionary produced for the text “I love roses and carnations. I hope I get roses for my birthday.” would include the following key-value pairs, among others:
'I': ['love', 'hope', 'get'] 'love': ['roses'] 'roses': ['and', 'for'] 'my': ['birthday.'] # as well as others!
Guidelines:
We discussed this function in lecture, and we strongly encourage you to begin with the starter code that you can find in the coursepack.
We also wrote a similar function in Task 3 of this week’s lab, and the solutions to that task are posted on the Labs page.
You should not try to remove the punctuation from the words in the text file.
The keys of the dictionary must include every word in the file
except the sentence-ending words. A sentence-ending word is
defined to be any word whose last character is a period ('.'
),
a question mark ('?'
), or an exclamation point ('!'
).
A sentence-ending word should be included in the lists
associated with the words that it follows (i.e., in the value
parts of the appropriate key-value pairs), but it must not
appear as its own key.
If a word w1
is followed by another word w2
multiple times in
the text file, then w2
must appear multiple times in the list
of words associated with w1
. This will allow you to capture the
frequency with which word combinations appear.
In addition to the words in the file, the dictionary must
include the string $
as a special key referred to as the
sentence-start symbol. This symbol will be used when
choosing the first word in a sentence. In the dictionary, the
list of words associated with the key '$'
must include:
Doing this will ensure that the list of words associated with
'$'
includes all of the words that start a sentence. For
example, the dictionary for the text “I scream. You
scream. We all scream for ice cream.” would include the
following entry for the sentence-start symbol:
'$': ['I', 'You', 'We']
Testing your code
To test your code, download the following file:
sample.txt
Make sure to put the file in your ps8
folder. If your browser
doesn’t allow you to specify where the file should be saved, try
right-clicking on the link above and choosing Save as... or
Save link as..., which should produce a dialog box that allows you
to choose the correct folder for the file.
This sample text file contains the following contents:
A B A. A B C. B A C. C C C.
Once this file is in place, run your ps8pr2.py
in Spyder and test
your function from the console:
>>> word_dict = create_dictionary('sample.txt') >>> word_dict result: {'A': ['B', 'B', 'C.'], 'C': ['C', 'C.'], 'B': ['A.', 'C.', 'A'], '$': ['A', 'A', 'B', 'C']}
The order of the keys–or of the elements within a given key’s
list of values–may not be the same as what you see above, but the
elements of the lists must appear in the quantities shown above
for each of the four keys 'A'
, 'B'
, 'C'
, and '$'
.
Here are some additional files you can use for testing:
edited_mission.txt - an edited version of BU’s mission statement, and the dictionary that we derived from it.
brave.txt - lyrics from the song Brave by Sara Bareilles, and its dictionary.
Here again, the ordering that you obtain for the keys and list elements in the dictionaries may be different. In addition, we have edited the formatting of the dictionaries to make them easier to read.
Write a function generate_text(word_dict, num_words)
that
takes as parameters a dictionary of word transitions (generated by
the create_dictionary
function) named word_dict
and a positive
integer named num_words
. The function must use word_dict
to
generate and print num_words
words, with a space after each
word. The function must print the words that it generates. It
must not return a value.
Guidelines:
We discussed this function in lecture, and we strongly encourage you to begin with the pseudocode provided in the coursepack and turn it into appropriate Python code.
The first word must be chosen randomly from the words associated
with the sentence-start symbol, '$'
. The second word must be
chosen randomly from the list of words associated with the first
word, etc. When the current word ends in a period ('.'
),
question mark ('?'
), or exclamation point ('!'
), the function
must detect this and start a new sentence by again choosing a
random word from among those associated with '$'
.
Do not include '$'
in the output text. It should only be
used as an internal marker for your function.
You must use the random.choice
function to choose from the list
of possible words. Don’t forget to include import random
at the
top of your file. Then, if wordlist
is the list of possible
words at a given point in the generated text, you can do
something like the following:
next_word = random.choice(wordlist)
Here again, you shouldn’t try to remove or change the punctuation associated with the words, and you don’t need to worry if the generated text doesn’t end with appropriate punctuation. The generated text won’t be perfect, but most of the time it will at least be meaningful!
If your function encounters a word that doesn’t have any words associated with it in the dictionary, the function must start a new sentence. This situation can occur if the last word in the file used to create the dictionary was unique and did not end with punctuation.
Examples
Here are two examples using the same text file as above. Your output
may differ because of the randomness involved in the generation.
>>> word_dict = create_dictionary('sample.txt') >>> generate_text(word_dict, 20) B C. C C C. C C C C C C C C C C C. C C C. A >>> generate_text(word_dict, 20) A B A. C C C. B A B C. A C. B A. C C C C C C.
Try some other examples using longer documents containing English
words, such as the works of William Shakespeare.
In particular, we are providing a text file containing the first act of
Romeo and Juliet, along with the files that we provided
above in the examples for create_dictionary
.
Login to Gradescope by clicking the link in the left-hand navigation bar, and click on the box for CS 111.
Submit your ps8pr1.py
and ps8pr2.py
files under the assignment
labeled PS 8 using these steps:
Click on the name of the assignment in the list of assignments. You should see a pop-up window with a box labeled DRAG & DROP. (If you don’t see it, click the Submit or Resubmit button at the bottom of the page.)
Add all four files to the box labeled DRAG & DROP. You can either drag and drop the files from their folder into the box, or you can click on the box itself and browse for the files.
Click the Upload button.
You should see a box saying that your submission was successful.
Click the (x)
button to close that box.
The Autograder will perform some tests on your files. Once it is done, check the results to ensure that the tests were passed. If one or more of the tests did not pass, the name of that test will be in red, and there should be a message describing the failure. Based on those messages, make any necessary changes. Feel free to ask a staff member for help.
Note: You will not see a complete Autograder score when you submit. That is because additional tests for at least some of the problems will be run later, after the final deadline for the submission has passed. For such problems, it is important to realize that passing all of the initial tests does not necessarily mean that you will ultimately get full credit on the problem. You should always run your own tests to convince yourself that the logic of your solutions is correct.
If needed, use the Resubmit button at the bottom of the page to resubmit your work. Important: Every time that you make a submission, you should submit all of the files for that Gradescope assignment, even if some of them have not changed since your last submission.
Near the top of the page, click on the box labeled Code. Then click on the name of each file to view its contents. Check to make sure that the files contain the code that you want us to grade.
Important
It is your responsibility to ensure that the correct version of every file is on Gradescope before the final deadline. We will not accept any file after the submission window for a given assignment has closed, so please check your submissions carefully using the steps outlined above.
If you are unable to access Gradescope and there is enough
time to do so, wait an hour or two and then try again. If you
are unable to submit and it is close to the deadline, email
your homework before the deadline to
cs111-staff@cs.bu.edu
Last updated on April 28, 2025.