Homework 1 (60pts)

CMSC 421 - Intro to Artificial Intelligence
Due February 8th, 11:59PM
Guidelines for all Homeworks
Homeworks should be uploaded to the ELMS server for the class under the
‘HW1’ project, in PDF format, with the filename “Firstname Lastname HW1.pdf”.
Additionally, put your name near the top of the first page. Typed submissions
are strongly preferred, but if you wish to write out part or all of your homework
by hand, you may - simply scan your paper or take a (legible) photo of it and
add that to the PDF file (you should still only submit one file, and make sure
to write neatly!). If you need help with this, see an instructor.
Here are the penalties for not complying with these guidelines:
• -3 points for a submission not in PDF format (more if we can’t read it).
• -3 points for an in-person (non-digital) submission.
• -2 points for an incorrect file name.
• -2 points for not putting your name on the first page.
• -30% if turned in late, up to 24 hours after the deadline.
• No credit if turned in late more than 24 hours after the deadline.
A. Math Review
In the following, show your work as you would in a Calculus course.
1. The sigmoid function (also called the standard logistic function, since
“sigmoid function” can also refer to a larger class of similarly shaped
functions) is given by
$$\sigma(x) = \frac{1}{1 + e^{-x}}.$$
It is often used for the activation function in neural networks.
(a) What are the upper and lower bounds of this function?
(b) Compute the derivative of this function.
(c) Express the derivative you computed in part (b) in terms of σ itself.
(d) Show that σ(−x) = 1 − σ(x).
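As an optional sanity check on your written answers (not a substitute for showing your work), the sigmoid can be evaluated numerically. The following is a minimal Python sketch; the helper names `sigmoid` and `central_difference` and the step size `h` are illustrative choices, not part of the assignment.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def central_difference(f, x: float, h: float = 1e-5) -> float:
    """Numerical estimate of f'(x); h is an arbitrary small step."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    # Tabulate the function value and an estimated slope at a few points.
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        value = sigmoid(x)
        slope = central_difference(sigmoid, x)
        print(f"x={x:+.1f}  sigmoid={value:.6f}  slope~{slope:.6f}")
```

Comparing the printed slopes with your closed-form derivative from parts (b) and (c) is one way to catch an algebra slip.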
2. The Heaviside function is a basic step function, defined by
$$H(x) = \begin{cases} 0, & x \le 0; \\ 1, & x > 0. \end{cases}$$
(a) Draw a graph of the Heaviside function.
(b) The integral of the Heaviside function is known as the rectifier. Compute this integral, and express it as a piecewise function.
(c) The rectifier is alternatively known as the ramp function. Draw a
graph of this function to see why.
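If you would like to check your piecewise answer numerically, the sketch below approximates an integral of the step function with midpoint rectangles. It is only an illustration: the names `heaviside` and `midpoint_sum`, the lower limit of -2, and the number of rectangles are all assumptions made for this example.

```python
def heaviside(x: float) -> float:
    """Step function as defined in problem 2: 0 for x <= 0, 1 for x > 0."""
    return 1.0 if x > 0 else 0.0


def midpoint_sum(f, a: float, b: float, n: int = 3000) -> float:
    """Approximate the integral of f over [a, b] using n midpoint rectangles."""
    width = (b - a) / n
    return sum(f(a + (k + 0.5) * width) for k in range(n)) * width


if __name__ == "__main__":
    # Integrate from an arbitrary lower limit (-2) up to several upper limits.
    for upper in (-1.0, 0.5, 2.0):
        area = midpoint_sum(heaviside, -2.0, upper)
        print(f"upper={upper:+.1f}  area~{area:.3f}")
```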
3. A common smooth approximation of the rectifier function is known as the
softplus function. It is given by $\mathrm{softplus}(x) = \ln(1 + e^{x})$.
(a) Compute the derivative of the softplus function.
(b) Show that this derivative is equal to another function mentioned on
this page.
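For intuition about the shape of softplus, here is a small Python sketch that tabulates it at a few inputs. The function name `softplus` mirrors the definition above; the sample points are arbitrary, and this simple version assumes moderate inputs (for very large x, `math.exp` overflows).

```python
import math


def softplus(x: float) -> float:
    """ln(1 + e^x), computed with log1p for accuracy when e^x is small."""
    return math.log1p(math.exp(x))


if __name__ == "__main__":
    # Sample points chosen only for illustration.
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"x={x:+.1f}  softplus={softplus(x):.6f}")
```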
4. A common task for AI to solve is optimization - the minimization or
maximization of some value or set of values.
(a) Given the equation $x^2 - 2y + 43 = 0$, find the pair (x, y) that minimizes
y.
(b) Given the same equation, find the pair (x, y) that minimizes x + y.
5. The gradient is one way to extend derivatives into the multi-variable realm,
and is used in machine learning techniques such as Stochastic Gradient
Descent (SGD). Determine the gradient (with respect to x and y) of the
following multivariate function, and then evaluate it at the point (2,2):
$$f(x, y) = 3\sin x - y^2 + 4xy$$
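A hand-computed gradient can be spot-checked with finite differences. The sketch below uses a different, made-up function `g` so that it does not give away the answer; `numerical_gradient`, `g`, and the step size `h` are assumptions for illustration only.

```python
import math


def numerical_gradient(f, x: float, y: float, h: float = 1e-6):
    """Estimate (df/dx, df/dy) at (x, y) with central differences."""
    df_dx = (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    df_dy = (f(x, y + h) - f(x, y - h)) / (2.0 * h)
    return df_dx, df_dy


def g(x: float, y: float) -> float:
    """Hypothetical example function, not the one from the assignment."""
    return x * x * y + math.cos(y)


if __name__ == "__main__":
    # By hand, the gradient of g is (2xy, x^2 - sin y), which is (0, 1) at (1, 0).
    print(numerical_gradient(g, 1.0, 0.0))
```

Swapping in your own function and point lets you compare the estimate against the gradient you derived by hand.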
6. The vector dot product of two same-sized vectors $\vec{u}$ and $\vec{v}$ is equal to the
sum of the pairwise products of corresponding entries in $\vec{u}$ and $\vec{v}$. In other
words, if $\vec{u} = \{u_1, u_2, \ldots\}$ and $\vec{v} = \{v_1, v_2, \ldots\}$, the dot product $\vec{u} \cdot \vec{v}$ is equal
to
$$\sum_{i=1}^{n} u_i v_i.$$
Compute the dot product of the two vectors {1,2,3} and {4,5,6}.
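The definition translates directly into code. This is a minimal sketch using plain Python lists; the function name `dot` and the sample vectors are made up for illustration and are deliberately not the vectors from the problem.

```python
def dot(u, v):
    """Sum of pairwise products of two equal-length sequences."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(ui * vi for ui, vi in zip(u, v))


if __name__ == "__main__":
    # 2*3 + 0*5 + (-1)*4 = 2
    print(dot([2, 0, -1], [3, 5, 4]))
```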
7. The product of an m × n matrix A and an n × r matrix B is equal to
A ∗ B, where the entry in row x column y is the dot product of row x from
matrix A with column y from matrix B. Compute both products A ∗ B
and B ∗ A for the following two 2 × 2 square matrices, clearly indicating
which product is which:
$$A = \begin{bmatrix} 1 & 2 \\ 4 & 5 \end{bmatrix}, \qquad B = \begin{bmatrix} 3 & 7 \\ 2 & 3 \end{bmatrix}$$
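The row-by-column rule described in this problem can be sketched in a few lines of Python. The function name `matmul` and the matrices `C` and `D` are hypothetical examples (not the matrices from the assignment), chosen to show that the order of the factors matters.

```python
def matmul(A, B):
    """Product of an m x n matrix A and an n x r matrix B (lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must match rows of B")
    r = len(B[0])
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(r)]
            for i in range(len(A))]


if __name__ == "__main__":
    C = [[1, 1], [0, 1]]
    D = [[1, 0], [1, 1]]
    print(matmul(C, D))  # [[2, 1], [1, 1]]
    print(matmul(D, C))  # [[1, 1], [1, 2]]
```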
8. The transpose of a matrix A is the matrix $A^T$, which consists of all the
same entries as A but with the rows and columns interchanged (i.e. the
first row becomes the first column, etc.), such that $(A^T)_{i,j} = A_{j,i}$.
(a) Determine the transpose of the following matrix:
$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$$
(b) Given two arbitrary matrices A and B of dimensions m × n and n × p,
respectively, prove that $(A \ast B)^T = (B^T) \ast (A^T)$. That is, prove that
the transpose of the product of two matrices is equal to the product
of their transposes in reverse order.
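Before writing the proof for part (b), it can help to see the identity hold on one concrete pair of matrices. The sketch below is only a spot check on made-up inputs and proves nothing in general; `transpose`, `matmul`, and the sample matrices are assumptions for illustration.

```python
def transpose(M):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Row-by-column product; assumes the inner dimensions agree."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


if __name__ == "__main__":
    A = [[1, 0, 2], [0, 3, 1]]    # 2 x 3, hypothetical example
    B = [[1, 2], [0, 1], [4, 0]]  # 3 x 2, hypothetical example
    lhs = transpose(matmul(A, B))
    rhs = matmul(transpose(B), transpose(A))
    print(lhs == rhs)  # True
```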
9. Mathematical induction is an important technique for proving mathematical statements, in which you break up a statement into the base case and
the inductive case. The base case (case 1) is your starting point, and is
often simple to prove. Then, the inductive case reads “if this
statement holds in cases 1 through n, then it must hold in case n + 1”.
If the set of all possible cases is countable (can be mapped one-to-one with the
integers), and we order them as “case 1, case 2, case 3...” such that any
particular case will be covered after a finite number of iterations, then
proving these two smaller statements suffices to prove the original statement.
Using mathematical induction, prove that the sum of the first n positive
integers is always equal to
$$\frac{n(n + 1)}{2}.$$
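A brute-force check over finitely many cases is not a proof, but it is a quick way to convince yourself the formula is plausible before starting the induction. In this sketch the names `sum_first` and `closed_form` and the cutoff of 100 are arbitrary choices.

```python
def sum_first(n: int) -> int:
    """Add 1 + 2 + ... + n directly."""
    return sum(range(1, n + 1))


def closed_form(n: int) -> int:
    """The formula n(n + 1) / 2; the product is always even, so // is exact."""
    return n * (n + 1) // 2


if __name__ == "__main__":
    # Checks only the first 100 cases; induction is what covers every n.
    print(all(sum_first(n) == closed_form(n) for n in range(1, 101)))
```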
10. Fill in a truth table like the one below (you may print it out, or just draw
your own) for the propositional logic expression P ∨ (Q ∧ R).
P Q R P ∨ (Q ∧ R)
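To see how the rows of a three-variable truth table are enumerated, here is a short Python sketch. It evaluates a different, hypothetical expression, (P ∧ Q) ∨ ¬R, so the table for this problem is still yours to fill in; the helper name `truth_table` is an assumption for illustration.

```python
from itertools import product


def truth_table(expr, n_vars: int):
    """Return one tuple per row: the variable values followed by the result."""
    rows = []
    for values in product([True, False], repeat=n_vars):
        rows.append((*values, expr(*values)))
    return rows


if __name__ == "__main__":
    # Hypothetical example expression, not the one from the assignment.
    rows = truth_table(lambda p, q, r: (p and q) or (not r), 3)
    print("P Q R result")
    for row in rows:
        print(" ".join("T" if v else "F" for v in row))
```

With three variables there are 2^3 = 8 rows, which is the number of rows your hand-drawn table needs.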