Conference on Learning Theory
Hristijan Jankulovski 61360413
Algorithms
April 4, 2017

Conference summary

The Conference on Learning Theory (COLT) addresses theoretical aspects of machine learning and related topics. The conference strongly supports a broad definition of learning theory, including, but not limited to: design and analysis of learning algorithms; statistical and computational complexity of learning; optimization models and algorithms for learning; unsupervised, semi-supervised, and active learning; online learning; artificial neural networks, including deep learning; learning with large-scale datasets; decision making under uncertainty; Bayesian methods in learning; high-dimensional and non-parametric statistical inference; planning and control, including reinforcement learning; learning with additional constraints, e.g. privacy, memory or communication budget; learning in other settings, e.g. social, economic, and game-theoretic; and analysis and applications of learning theory in related fields such as natural language processing, neuroscience, bioinformatics, privacy and security, machine vision, and information retrieval.

The proceedings contain the 63 papers accepted to and presented at the 29th Conference on Learning Theory (COLT), held in New York, USA, on June 23-26, 2016. These papers were selected from 199 submissions by the program committee, with additional help from external expert reviewers.

Paper reviews

Online Sparse Linear Regression

In this paper, the authors model the problem of prediction with limited access to features, in the most natural and basic manner, as an online sparse linear regression problem. In this problem, a learner makes predictions for the labels of examples arriving sequentially over a number of rounds. Each example has d features that can potentially be accessed by the learner. However, in each round, the learner is restricted to choosing an arbitrary subset of features of size at most k. The learner acquires the values of this subset of features and then makes its prediction, at which point the true label of the example is revealed, and the learner suffers a loss for an incorrect prediction. The goal of the learner is to make predictions with total loss comparable to the loss of the best sparse linear regressor. To measure the performance of the online learner, they use the standard notion of regret, which is the difference between the total loss of the online learner and the total loss of the best sparse linear regressor. Thus the goal is to minimize regret with respect to the best sparse linear regressor, where prediction accuracy is measured by the square loss. For this problem, they give an inefficient algorithm that obtains regret bounded by Õ(√T) after T prediction rounds. They complement this result by showing that no algorithm running in polynomial time per iteration can achieve regret bounded by O(T^(1−δ)) for any constant δ > 0 unless NP ⊆ BPP. This hardness result holds even if the algorithm is allowed to access more features than the best sparse linear regressor, up to a logarithmic factor in the dimension.
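To make the interaction protocol concrete, here is a minimal sketch of the online sparse regression loop and the loss it accumulates. This is an illustration of the setting only, not the paper's algorithm; the names (online_sparse_regression, choose_features, predict) and the naive fixed-subset policy in the usage example are hypothetical.

```python
import numpy as np

def online_sparse_regression(examples, labels, k, choose_features, predict):
    """Simulate the online sparse linear regression protocol.

    examples: (T, d) array of feature vectors, hidden from the learner.
    labels:   (T,) array of true labels, revealed after each prediction.
    k:        maximum number of features the learner may observe per round.
    choose_features(t): indices (at most k) the learner queries in round t.
    predict(idx, vals): the learner's prediction from the observed values.
    """
    total_loss = 0.0
    for t in range(len(labels)):
        idx = list(choose_features(t))[:k]      # at most k features revealed
        observed = examples[t, idx]             # the only values the learner sees
        y_hat = predict(idx, observed)          # prediction from partial information
        total_loss += (y_hat - labels[t]) ** 2  # square loss on the true label
    return total_loss

# Hypothetical usage: always query the first two features, predict their sum.
X = np.random.randn(100, 10)
y = X[:, 0] + X[:, 1]
loss = online_sparse_regression(X, y, k=2,
                                choose_features=lambda t: [0, 1],
                                predict=lambda idx, vals: float(vals.sum()))
# The regret would subtract the hindsight loss of the best k-sparse linear
# regressor, i.e. the minimum over k-sparse w of sum_t (x_t . w - y_t)^2.
```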
Online Isotonic Regression

Similarly to the previous paper, this paper considers the online version of a regression problem, in this case isotonic regression: a problem of sequential prediction in the class of isotonic (non-decreasing) functions. A learner is given a set of T linearly ordered points. Over the course of T trials, the adversary picks a new (as yet unlabeled) point and the learner predicts a label from [0, 1] for that point. Then the true label (also from [0, 1]) is revealed, and the learner suffers the squared error loss. After T rounds, the learner is evaluated by means of the regret, which is its total squared loss minus the loss of the best isotonic function in hindsight. They survey several standard online learning algorithms and show that none of them achieves the optimal regret exponent; in fact, most of them (including Online Gradient Descent, Follow the Leader and Exponential Weights) incur linear regret. They then prove that the Exponential Weights algorithm played over a covering net of isotonic functions has regret bounded by O(T^(1/3) log^(2/3)(T)) and present a matching Ω(T^(1/3)) lower bound on the regret. They also provide a computationally efficient version of this algorithm. Finally, they analyze the noise-free case, in which the revealed labels are themselves isotonic, and show that the bound can be improved to O(log T) or even to O(1).
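The covering-net idea can be illustrated with a brute-force sketch: enumerate isotonic functions whose values lie on a discrete grid and run Exponential Weights over them as experts. This is a simplified illustration under assumed names (isotonic_net, exponential_weights, the grid resolution levels, and the learning rate eta are all illustrative choices); the paper's actual net and its efficient implementation differ, and the enumeration below is exponential in T, so it only runs for very small instances.

```python
import itertools
import numpy as np

def isotonic_net(T, levels):
    """All non-decreasing functions on T ordered points whose values lie on
    a uniform grid in [0, 1]. Exponentially many -- for tiny T only."""
    grid = np.linspace(0.0, 1.0, levels)
    return [np.array(f) for f in itertools.product(grid, repeat=T)
            if all(f[i] <= f[i + 1] for i in range(T - 1))]

def exponential_weights(reveal_order, labels, experts, eta):
    """Exponential Weights over the net: predict the weighted mean of the
    experts' values at the revealed point, then decay each expert's weight
    exponentially in its own squared loss."""
    w = np.ones(len(experts))
    total_loss = 0.0
    for t, i in enumerate(reveal_order):   # adversary reveals point i
        p = w / w.sum()
        y_hat = sum(pj * f[i] for pj, f in zip(p, experts))
        total_loss += (y_hat - labels[t]) ** 2
        losses = np.array([(f[i] - labels[t]) ** 2 for f in experts])
        w *= np.exp(-eta * losses)
    return total_loss

# Tiny example: T = 5 points revealed in a fixed order.
experts = isotonic_net(T=5, levels=4)
print(exponential_weights([2, 0, 4, 1, 3], [0.1, 0.0, 0.9, 0.2, 0.6],
                          experts, eta=2.0))
```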
An algorithm with nearly optimal pseudo-regret for both stochastic and adversarial bandits

Unlike the first two papers, this paper studies the multi-armed bandit problem, the most basic example of a sequential decision problem with an exploration-exploitation trade-off. Specifically, in a K-armed bandit problem there are K different options, or arms, one of which can be pulled on each trial; each arm, when pulled, generates some reward (or loss). On each trial, the learner selects one arm to pull and observes the reward associated with only that arm; the rewards associated with the other arms remain hidden from the learner (it is in this sense that the learner receives only partial information). The goal of the learner is to accumulate as much reward as possible over a sequence of trials, e.g. compared to the best fixed arm in hindsight. In practice it is difficult to work with the expected regret directly, so a simplified notion, the pseudo-regret, is used instead; the pseudo-regret is upper bounded by the expected regret. They present an algorithm that achieves almost optimal pseudo-regret bounds against both adversarial and stochastic bandits: over n trials, the pseudo-regret is O(√(Kn log n)) against adversarial bandits and O(Σ_i (log n)/Δ_i) against stochastic bandits, where Δ_i is the gap between the mean reward of the best arm and that of arm i.
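A short sketch makes the protocol and the pseudo-regret concrete. The code below simulates a stochastic bandit and computes the pseudo-regret of whatever arm-selection policy it is given; the uniform-exploration policy at the end is a hypothetical placeholder, not the paper's algorithm, and the names (run_bandit, pseudo_regret, select_arm) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def pseudo_regret(means, pulls):
    """Pseudo-regret after the run: sum over arms of (best mean - arm mean)
    times the number of times that arm was pulled."""
    best = max(means)
    return sum(p * (best - mu) for mu, p in zip(means, pulls))

def run_bandit(means, n, select_arm):
    """Simulate n trials of a stochastic K-armed bandit with Bernoulli
    rewards. select_arm(t, history) returns the arm to pull in trial t;
    only the pulled arm's reward is observed and appended to history."""
    pulls = [0] * len(means)
    history = []
    for t in range(n):
        a = select_arm(t, history)
        r = rng.binomial(1, means[a])   # reward of the pulled arm only
        history.append((a, r))
        pulls[a] += 1
    return pseudo_regret(means, pulls)

# Hypothetical placeholder policy: pull an arm uniformly at random.
print(run_bandit([0.2, 0.5, 0.8], n=1000,
                 select_arm=lambda t, h: int(rng.integers(0, 3))))
```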