Homework 7
Table of Contents
The following assignment is due Thursday 10/31 by 11:59 PM. You should submit all your written solutions to Gradescope as a single pdf. Follow the instructions in the programming problem for code solutions.
- Your solutions must be exceptionally neat and the final answer in your solution to each problem must be abundantly clear, e.g., surrounded in a very visible box. The graders have license to dock points for illegible or unclear solutions.
- For the written part, choose the correct pages corresponding to each problem in Gradescope. Note that Gradescope registers your submission as soon as you submit it, so you don't need to rush to choose the pages. You will receive no credit if you do not choose the correct pages, no exceptions.
1. Stochastic Matrices
For each of the following matrices:
- Draw the state diagram for \(A\). You should label each node with a positive integer for the column of \(A\) which describes its edges.
- Determine if \(A\) is regular. If it is write down the smallest exponent \(k\) such that \(A^k\) has strictly positive values. If it is not, then justify your answer.
- Determine the general form solution of the equation \((A - I)\mathbf x = \mathbf 0\). You may use a computer to do this. You answer should be expressed in fractions, not decimals.
- Determine a steady state vector for \(A\). If the steady state vector is you unique, write the steady state is unique. You must do this by hand and show your work.
1.1. (5 points)
1.2. (5 points)
1.3. (5 points)
1.4. (5 points)
1.5. (5 points)
1.6. (5 points)
2. Market Prediction
(6 points) Suppose you've been watching the stock price of your favorite company and you've discovered the following trends based on the data you've taken:
- If the price is STEADY there's a 70% chance it will remain steady the next day, but a 20% chance it will go HIGH and a 10% chance it will dip LOW.
- If the price is HIGH there's a 55% chance it will remain high the next day, but a 40% change it will go back to being STEADY, and just a 5% chance it will dip all the way down to LOW.
- If the price is LOW, there is a 60% change it will remain low the next day, but a 30% chance it will reset to STEADY, and a 10% chance it will suddenly jump to HIGH.
Suppose that the price is currently STEADY. In the long term, is more likely to remain STEADY, go up HIGH, or dip down LOW? Justify your answer. You may use a computer to solve any systems you need to solve, and you must present your final answer in fractions, not decimals.
3. LU Factorizations
For each of the following matrices, determine its LU factorization by hand. You must show your work.
3.1. (2 points)
3.2. (4 points)
3.3. (4 points)
Without doing any row operations, determine the LU factorization for the matrix obtained by applying the row operations
\begin{align*} R_3 &\gets R_3 - 2R_2 \\ R_3 &\gets R_3 + R_1 \\ R_2 &\gets R_2 + R_1 \\ \end{align*}to the matrix
\begin{bmatrix} 2 & 8 & -4 & 1 \\ 0 & 0 & 3 & 2 \\ 0 & 0 & 0 & -7 \end{bmatrix}Justify your answer.
4. Sparse Factorization
In this problem, you will use the provided code in sparse.py
to
benchmark LU factorization and matrix inversion.
4.1. (2 points)
The provided code includes a function test_matrix
which
builds a matrix given a positive integer parameter. Write down the
matrix test_matrix(4)
.
4.2. (4 points)
Determine the number of \(\textcolor{red}{\textit{nonzero}}\) entries in
the inverse of test_matrix(10 ** 3)
. Then use scipy.linalg.lu
to
determine the LU factorization of test_matrix(10 ** 3)
and determine
the number of entries in \(L\) and \(U\). Read the linked documentation
on scipy.linalg.lu
for more details (you can ignore the permutation matrix
in the given factorization). Report these three entry-counts in your
solution.
4.3. (4 points)
Uncomment the code at the end of the file sparse.py
and run
it. Write down the printed values and discuss what they mean. In
particular, is the amount of time that it takes to factor versus
invert consistent with our discussion in lecture? (If you are running
on a fast computer you may have to change \(n\) to \(15^3\) to see the
difference).
5. ChatMarkov (Programming)
(12 points) Large language models like GPT-4 use fancy techniques to predict what word or sequence of words should follow a given piece of text. It’s possible to build a very unfancy version of this using random walks. Given a corpus of text, we can build a state diagram whose nodes are words, and whose edges indicate word adjacency in the corpus weighted proportionally to the number of occurrences of that adjacent pair. So words which tend to be next to each other will have larger weights on their edges.
As an example, suppose we used the statement
a dog and a cat and a bird
as our corpus. We can build a state diagram with nodes for each word
V = ["a", "dog", "and", "cat", "bird"]
and with edges for each pair of adjacent words
E = [("a", "dog"), ("dog", "and"), ("and", "a"), ("a", "cat"), ...] W = [1/3 , 1 , 1 , 1/3 , ...]
weighted proportionally to the number of adjacencies relative to other adjacencies with the same starting word.
Rather than using a single statement for our corpus, we’ll use the entirety of Shakespeare’s sonnets to generate new (nonsense) sonnets by taking a random walk on this graph and collecting words along the way.
The process is roughly as follows:
- Read in the text of Shakespeare’s sonnets and build the stochastic matrix for the state machine described above.
- Perform a random walk on this matrix, keeping track of the nodes you’ve visited so far.
- Use that list of nodes to generate a list of words which will make up the poem. Then format that poem so that it can be nicely printed.
You are given starter code in the file hw07.py
. Don’t change the
name of this file when you submit. Also don’t change any of the names
of functions included in the starter code. The only changes you should
make are to fill in the TODO items in the starter code. Most of the
above process in implemented for you. All together you have to fill in
two functions:
to_stochastic
which converts a list of word pairs into a stochastic matrix as described above (Hint. Make sure to look at and understandnumpy.sum
).random_walk
which performs a random walk of a given length and returns the list of nodes visited. There is already an implementation of a single random step in the functionrandom_step
. You should use this function and collect the outputs into a list.
See the docstrings in the starter code for more details. Once you've
completed this, you should be able to run your code using python3
hw07.py
and it should print a (unhinged) poem for you:
from fairest wights, and so thine own, in one of my sinful then, and suffer dearth, painting thy shadow’s form another; whose worth’s unknown, although their treasure. so oft when clouds to trust; enjoy’d no name, blesses an older friend, you most precious friends possess’d, desiring this more. then, if thou leave? thy lovers gone, save what they see thy mind. these poor beauty of nature’s riches from those errors hath her pipe in my soul of faults should despair, which shall stay. no shape so suited, and straight will...
You will upload a single file hw07.py
. You will also be provided
with a text file sonnets.txt
which contains the text of the entirety
of Shakespeare’s sonnets. You do not need to upload this when you
submit, you just need to upload hw07.py
.