Machine Problem (due 11:59PM, April 26):

Goal: You are expected to apply duality theory and the KKT conditions to solve an optimization problem in closed form; you will also apply the subgradient projection method to solve, iteratively, an optimization problem that does not admit a closed-form solution; and, for large-scale problems where you cannot load the entire dataset into physical memory, you will learn to use the incremental subgradient method to solve the problem with one data point in memory at a time.

Background (linear regression problem): Suppose we are given a set of $n$ examples in a $p$-dimensional feature space, $X = [x_1, \dots, x_n] \in \mathbb{R}^{n \times p}$, and their corresponding responses $Y = [y_1, \dots, y_n] \in \mathbb{R}^{n \times q}$ of dimension $q$. You want to use a linear regression model to reveal the relation between $x_i$ and $y_i$, for $i = 1, \dots, n$. In other words, you need to find a transformation matrix $W \in \mathbb{R}^{p \times q}$ that minimizes the least-squares error:

$$\min_W F(W) = \|Y - XW\|_F^2$$

To improve robustness, a proper regularizer on $W$ can be used to constrain the model in addition to the above objective. The most useful regularizers include the $\ell_1$ and $\ell_2$ norms. To make the problem interesting, we focus on the $\ell_1$ norm. We then have the following optimization problem:

$$\min_W F(W) = \|Y - XW\|_F^2 \quad \text{s.t.} \quad \|W\|_1 \le c$$

where $c$ is a preset parameter that controls how sparse the transformation matrix will be.

Problems:

Part 1 (projection onto the constraint set). Use the dual problem (by forming the Lagrangian) and the KKT conditions to derive the projection of a point $W_0$ onto the convex constraint set $\|W\|_1 \le c$; that is, solve the following optimization problem:

$$\min_W \|W - W_0\|^2 \quad \text{s.t.} \quad \|W\|_1 \le c$$

Note that this optimization problem can be solved in closed form. (An illustrative code sketch appears at the end of this handout.)

Part 2 (subgradient method). Randomly generate a set of data examples $X$ and their target responses $Y$ from a linear model with some noise added; for example, given a known $W$, generate $Y = XW + \varepsilon$, where $\varepsilon$ is Gaussian noise. Then apply the subgradient projection method to solve the $\ell_1$-constrained least-squares problem for different values of $c$. You will need the projection derived in Part 1. Compare the obtained $W$ with the ground-truth $W$ used to generate the data. In this part, try different rules for the stepsize used in the subgradient projection method (constant stepsize, varying stepsize) and compare the convergence. (See the sketch at the end of this handout.)

Part 3 (incremental subgradient method). When the number of examples $n$ is very large, it is computationally demanding to compute the subgradient of the full objective in each iteration to update $W$. In this case, we can decompose the objective $F(W)$ into a sum of terms, one per data point,

$$F(W) = \sum_{i=1}^{n} \|y_i - W^\top x_i\|_2^2,$$

where $x_i \in \mathbb{R}^p$ and $y_i \in \mathbb{R}^q$ denote the $i$-th example and response, and apply the incremental subgradient projection method to solve the problem. Another advantage of this method is that when the data are too large to fit in physical memory, you only need to load a single data point into memory and compute the corresponding subgradient. This makes the algorithm scalable to very large optimization problems. (See the sketch at the end of this handout.)

What to submit: For Part 1, submit your derivation in hard copy. For Parts 2 and 3, submit your source code together with a brief report on your results (e.g., curves showing the progress of the optimization algorithm over iterations, and the final accuracy achieved with different values of $c$). Also compare the observed convergence rate with the theoretical result we derived in class.
Note that you can implement the algorithms in whatever programming language you choose (e.g., Matlab, Python, Java, C/C++, C#). Send your source code and report to [email protected]. Timeline: You are given four weeks to submit your results. For Part 1, if you submit your derivation within the first two weeks, you will get a bonus. After two weeks, I will refer you to a paper where you can find the solution. (It is fine if you dig the paper out yourself, study it, and re-derive the solution in your own words within the first two weeks; you will still get the bonus. But do not ask me which paper it is during the first two weeks. Note that finding proper references is a very important skill in research. That's why we call it re-search.)
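Illustrative code sketches (editorial additions, not part of the required submission). The following Python sketches show one plausible shape of each part; all function names, stepsize rules, and problem sizes are hypothetical choices, not prescribed by the assignment. The first is a minimal sorting-based sketch of the Part 1 projection, assuming $\|W\|_1$ denotes the entrywise sum of absolute values; the soft-thresholding structure it uses is what the KKT derivation you are asked to produce should justify.

import numpy as np

def project_l1_ball(W0, c):
    # Euclidean projection of W0 onto {W : sum_ij |W_ij| <= c}, assuming c > 0.
    v = W0.ravel()
    if np.abs(v).sum() <= c:
        return W0.copy()                            # already feasible
    u = np.sort(np.abs(v))[::-1]                    # magnitudes, descending
    cssv = np.cumsum(u)                             # cumulative sums of magnitudes
    # largest index rho with u[rho] - (cssv[rho] - c) / (rho + 1) > 0
    rho = np.nonzero(u - (cssv - c) / np.arange(1, len(u) + 1) > 0)[0][-1]
    theta = (cssv[rho] - c) / (rho + 1.0)           # threshold implied by the KKT conditions
    w = np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)   # soft-threshold each entry
    return w.reshape(W0.shape)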
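Next, a minimal sketch of the data generation and the subgradient projection loop for Part 2. Since the objective is differentiable, its (sub)gradient is $\nabla F(W) = 2X^\top(XW - Y)$. The problem sizes, noise level, and stepsize rules below are placeholder assumptions; project_l1_ball is the sketch above.

import numpy as np

rng = np.random.default_rng(0)
n, p, q = 200, 50, 3                                # hypothetical problem sizes
W_true = rng.normal(size=(p, q))                    # ground-truth transformation
X = rng.normal(size=(n, p))
Y = X @ W_true + 0.1 * rng.normal(size=(n, q))      # Y = XW + Gaussian noise

def subgradient_projection(X, Y, c, iters=1000, constant_step=None):
    W = np.zeros((X.shape[1], Y.shape[1]))
    obj = []
    for t in range(1, iters + 1):
        G = 2.0 * X.T @ (X @ W - Y)                 # gradient of ||Y - XW||_F^2
        # constant stepsize if given, else a normalized diminishing rule
        alpha = constant_step if constant_step is not None \
            else 1.0 / (np.linalg.norm(G) * np.sqrt(t))
        W = project_l1_ball(W - alpha * G, c)       # projected subgradient step
        obj.append(np.linalg.norm(Y - X @ W, "fro") ** 2)
    return W, obj

W_hat, obj = subgradient_projection(X, Y, c=0.5 * np.abs(W_true).sum())
print("||W_hat - W_true||_F =", np.linalg.norm(W_hat - W_true, "fro"))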
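Finally, a sketch of the incremental variant for Part 3: the sum over examples is processed one term at a time, using the per-example subgradient $2\,x_i(x_i^\top W - y_i)$ and the same projection, so only one row of the data is touched per update. The 1/t stepsize and epoch count are again placeholder assumptions; rng and project_l1_ball are reused from the sketches above.

def incremental_subgradient(X, Y, c, epochs=50, alpha0=1e-2):
    n, p = X.shape
    W = np.zeros((p, Y.shape[1]))
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):                # visit one example at a time
            t += 1
            r = X[i] @ W - Y[i]                     # residual of example i, shape (q,)
            G = 2.0 * np.outer(X[i], r)             # subgradient of ||y_i - W^T x_i||^2
            W = project_l1_ball(W - (alpha0 / t) * G, c)   # diminishing 1/t stepsize
    return W

W_inc = incremental_subgradient(X, Y, c=0.5 * np.abs(W_true).sum())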