Approximate resilience, monotonicity, and the complexity of agnostic learning
Vitaly Feldman (IBM Research – Almaden)
Dana Dachman-Soled (UMD)
Li-Yang Tan (Simons Inst.)
Andrew Wan (IDA)
Karl Wimmer (Duquesne)
SODA, 2015
Learning from examples
[Figure: labeled examples (+ and −) are fed to a learning algorithm, which outputs a classifier]
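Agnostic learning [V 84; H 92; KSS 94]
[Figure: + and − labeled examples]
Example: $(x, \ell) \sim P$
$P$: distribution over $X \times \{-1, +1\}$
$\mathrm{Opt}_P(C) = \min_{f \in C} \Pr_{(x,\ell) \sim P}[f(x) \neq \ell]$
Agnostic learning of a class of functions $C$ with excess error $\epsilon$: for every $P$, output w.h.p. a hypothesis $h$ such that
$\Pr_{(x,\ell) \sim P}[h(x) \neq \ell] \le \mathrm{Opt}_P(C) + \epsilon$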
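The definition above can be checked by brute force on a toy instance. The sketch below is illustrative only: the two-variable domain, the 10% label noise, and the helper names (`error`, `opt`, `conjunction`) are assumptions introduced here, not part of the talk. It computes $\mathrm{Opt}_P(C)$ for the class of monotone conjunctions and the excess error of one fixed hypothesis.

```python
from fractions import Fraction
from itertools import combinations, product


def error(h, dist):
    """Pr_{(x, label) ~ P}[h(x) != label] for a finite P given as {(x, label): prob}."""
    return sum(p for (x, label), p in dist.items() if h(x) != label)


def opt(concept_class, dist):
    """Smallest error achieved by any function in the class."""
    return min(error(f, dist) for f in concept_class)


def conjunction(indices):
    """Monotone conjunction of the variables in `indices`, with values in {-1, +1}."""
    return lambda x: 1 if all(x[i] == 1 for i in indices) else -1


# Assumed toy distribution: x uniform over {0,1}^2, label = (x0 AND x1),
# flipped with probability 1/10.
n = 2
noise = Fraction(1, 10)
target = conjunction((0, 1))
dist = {}
for x in product((0, 1), repeat=n):
    px = Fraction(1, 2 ** n)
    dist[(x, target(x))] = px * (1 - noise)
    dist[(x, -target(x))] = px * noise

# C = all monotone conjunctions over the n variables (including the empty one).
concept_class = [conjunction(s) for r in range(n + 1)
                 for s in combinations(range(n), r)]

opt_value = opt(concept_class, dist)
h = conjunction((0,))                 # a suboptimal hypothesis: h(x) = x0
excess = error(h, dist) - opt_value   # excess error of h relative to Opt_P(C)
print(opt_value, excess)
```

Exact rational arithmetic is used so that the comparison with $\mathrm{Opt}_P(C) + \epsilon$ involves no rounding.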
Complexity of agnostic learning
• Too hard
– Conjunctions over $\{0,1\}^n$: $n^{O(\sqrt{n})}$ for $\epsilon = \Omega(1)$ [KKMS 05]
• … but can't prove lower bounds
Complexity of agnostic learning
Distribution-specific learning over distribution $D$
Marginal of $P$ on $X$ equals a fixed $D$
• For uniform distribution $U$ over $\{0,1\}^n$
– Conjunctions: $n^{O(\log 1/\epsilon)}$
– Halfspaces: $n^{O(1/\epsilon^4)}$ [KKMS 05]; $n^{O(1/\epsilon^2)}$ [FKV 14]
– Monotone juntas?
Our characterization
• $\|f\|_1 = \mathbf{E}_{x \sim U}[|f(x)|]$
• $\mathrm{Pol}_d$: polynomials of degree $d$ over $n$ vars
• $\Delta(f, \mathrm{Pol}_d) = \min_{p \in \mathrm{Pol}_d} \|f - p\|_1$
THM: For degree $d$ let $\delta = \max_{f \in C} \Delta(f, \mathrm{Pol}_d)/2$.
Any SQ algorithm for agnostically learning $C$ over $U$ with excess error $< \delta$ needs $n^{\Omega(d)}$ time
* $C$ is closed under renaming of variables; $d \le \sqrt[3]{n}$
Polynomial $L_1$ regression [KKMS 05]
For degree $d$ let $\delta = \max_{f \in C} \Delta(f, \mathrm{Pol}_d)/2$
There exists an SQ algorithm for learning $C$ over $U$ with excess error $\delta + \epsilon$ in time $\mathrm{poly}(n^d/\epsilon)$
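For orientation, the sample-based form of polynomial $L_1$ regression can be written as a linear program. This is a sketch of the standard formulation, not the SQ implementation referred to on the slide; the sample size $s$, the coefficients $c_S$, the slack variables $t_j$, the parity basis $\chi_S$, and the threshold $\theta$ are notation introduced here.

```latex
% Sample (x^1, \ell^1), ..., (x^s, \ell^s) drawn from P; unknowns are c_S and t_j.
\begin{align*}
\text{minimize}\quad & \frac{1}{s}\sum_{j=1}^{s} t_j \\
\text{subject to}\quad & -t_j \;\le\; \sum_{|S| \le d} c_S\,\chi_S(x^j) - \ell^j \;\le\; t_j,
  \qquad j = 1,\ldots,s, \\
\text{output}\quad & h(x) = \mathrm{sign}\bigl(p(x) - \theta\bigr),
  \qquad p(x) = \sum_{|S| \le d} c_S\,\chi_S(x),
\end{align*}
% with \theta \in [-1, 1] chosen to minimize the empirical error of h.
```

The program has one coefficient per set $S$ of size at most $d$, which is where the $n^d$ factor in the running time comes from.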
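Statistical queries [Kearns 93]
[Figure: the SQ learning algorithm sends queries $q_1, q_2, \ldots, q_t$ to the SQ oracle for $P$ and receives answers $v_1, v_2, \ldots, v_t$]
$q_i : X \times \{-1,1\} \to [-1,1]$, $|v_i - \mathbf{E}_P[q_i(x, \ell)]| \le \tau$
$\tau$ is the tolerance of the query
Complexity $T$:
• at most $T$ queries
• each of tolerance at least $1/T$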
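The oracle model above can be simulated directly when the distribution is small enough to enumerate. The sketch below is a minimal illustration under that assumption; the class name `SQOracle`, its methods, and the toy distribution are hypothetical and introduced here. Any answer within the tolerance of the true expectation is valid, so a random perturbation stands in for the oracle's freedom.

```python
import random
from itertools import product


class SQOracle:
    """Simulated statistical query oracle for a finite distribution {(x, label): prob}."""

    def __init__(self, dist, seed=0):
        self.dist = dist
        self.rng = random.Random(seed)
        self.num_queries = 0
        self.min_tau = float("inf")

    def query(self, q, tau):
        """Return some value within tau of E_P[q(x, label)], where q maps into [-1, 1]."""
        exact = sum(p * q(x, label) for (x, label), p in self.dist.items())
        self.num_queries += 1
        self.min_tau = min(self.min_tau, tau)
        return exact + self.rng.uniform(-tau, tau)

    def complexity(self):
        """Smallest T with at most T queries, each of tolerance at least 1/T."""
        return max(self.num_queries, 1.0 / self.min_tau)


# Assumed toy distribution: x uniform over {0,1}^2, label = +1 iff x[0] == 1.
dist = {((a, b), 1 if a == 1 else -1): 0.25 for a, b in product((0, 1), repeat=2)}

oracle = SQOracle(dist)
tau = 0.05
# Correlation query: how well does the first coordinate predict the label?
v = oracle.query(lambda x, label: label * (1 if x[0] == 1 else -1), tau)
print(v, oracle.complexity())
```

A learning algorithm written against `query` never sees individual examples, which is the restriction that makes SQ lower bounds provable.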
SQ algorithms
• PAC/agnostic learning algorithms (except Gaussian elimination)
• Convex optimization (Ellipsoid, iterative methods)
• Expectation maximization (EM)
• SVM (with kernel)
• PCA
• ICA
• ID3
• $k$-means
• method of moments
• MCMC
• Naïve Bayes
• Neural Networks (backprop)
• Perceptron
• Nearest neighbors
• Boosting
[K 93, BDMN 05, CKLYBNO 06, FPV 14]
Roadmap
Proof: approximate resilience
Tools: analysis of approximate resilience
Application: monotone functions
Open problems
Approximate resilience
$f$ is $d$-resilient if for any degree-$d$ polynomial $p$: $\mathbf{E}_U[f(x)\,p(x)] = 0$
Equivalently, all Fourier coefficients of $f$ of degree at most $d$ are 0
$f$ is $\alpha$-approximately $d$-resilient if there exists a $d$-resilient $g : \{0,1\}^n \to [-1,1]$ such that $\|f - g\|_1 \le \alpha$
THM: a Boolean $f$ is $\alpha$-approximately $d$-resilient if and only if $\Delta(f, \mathrm{Pol}_d) \ge 1 - \alpha$
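A small worked example may help make the duality concrete. The sketch below uses majority on three bits, written over $\{-1,1\}^3$ rather than $\{0,1\}^3$ for convenience; the example function and all helper names are assumptions introduced here. The function $g = -x_1 x_2 x_3$ is 2-resilient and at $L_1$ distance $1/2$ from majority, while the degree-1 polynomial $p = (x_1 + x_2 + x_3)/2$ is also at distance $1/2$, so both sides of the theorem meet at $\Delta(\mathrm{MAJ}_3, \mathrm{Pol}_2) = 1/2$.

```python
from fractions import Fraction
from itertools import combinations, product


def chi(S, x):
    """Parity (character) on the coordinates in S, for x in {-1, 1}^n."""
    out = 1
    for i in S:
        out *= x[i]
    return out


def fourier_coefficient(f, n, S):
    """E_x[f(x) * chi_S(x)] under the uniform distribution on {-1, 1}^n."""
    return Fraction(1, 2 ** n) * sum(f(x) * chi(S, x)
                                     for x in product((-1, 1), repeat=n))


def is_resilient(f, n, d):
    """True if every Fourier coefficient of degree at most d is zero."""
    return all(fourier_coefficient(f, n, S) == 0
               for r in range(d + 1) for S in combinations(range(n), r))


def l1_distance(f, g, n):
    """E_x[|f(x) - g(x)|] under the uniform distribution."""
    return Fraction(1, 2 ** n) * sum(abs(f(x) - g(x))
                                     for x in product((-1, 1), repeat=n))


def maj3(x):
    return 1 if sum(x) > 0 else -1


def g(x):
    # Bounded 2-resilient witness: minus the parity of all three bits.
    return -x[0] * x[1] * x[2]


def p(x):
    # Degree-1 polynomial approximating majority.
    return Fraction(sum(x), 2)


alpha = l1_distance(maj3, g, 3)   # maj3 is alpha-approximately 2-resilient
upper = l1_distance(maj3, p, 3)   # upper bound on Delta(maj3, Pol_2)
print(is_resilient(g, 3, 2), alpha, upper)
```

Here the witness $g$ gives $\Delta \ge 1 - \alpha = 1/2$ and the polynomial $p$ gives $\Delta \le 1/2$, so no linear program is needed for this particular function.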
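From approx. resilience to SQ hardness
$C$ closed under variable renaming $\Rightarrow$ there exist $m = n^{\Omega(d)}$ functions $f_1, \ldots, f_m$ such that the corresponding $g_1, \ldots, g_m$ are uncorrelated
$\|f - g\|_1 \le \alpha$ $\Rightarrow$ there is a distribution $P_g$ such that $\Pr_{(x,\ell) \sim P_g}[f(x) \neq \ell] \le \alpha/2$
Complexity of any SQ algorithm with error $\frac{1}{2} - \frac{1}{m^{1/3}}$ is at least $m^{1/3}$ [BFJKMR 94; F 09]
Excess error: $\frac{1}{2} - \frac{1}{m^{1/3}} - \frac{\alpha}{2} = \frac{\Delta(f, \mathrm{Pol}_d)}{2} - \frac{1}{m^{1/3}}$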
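The step from $L_1$ distance to labeling error can be verified numerically. A natural choice for $P_g$, assumed here, is to draw $x$ uniformly and set the label to $+1$ with probability $(1 + g(x))/2$, so that $\mathbf{E}[\ell \mid x] = g(x)$. For Boolean $f$ this gives $\Pr[f(x) \neq \ell] = \mathbf{E}[(1 - f(x) g(x))/2] = \|f - g\|_1/2$. The sketch below checks this identity on a toy case; the functions and helper names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def label_distribution(g, n):
    """P_g: x uniform over {-1, 1}^n, label +1 with probability (1 + g(x)) / 2."""
    dist = {}
    for x in product((-1, 1), repeat=n):
        px = Fraction(1, 2 ** n)
        p_plus = (1 + Fraction(g(x))) / 2
        dist[(x, 1)] = px * p_plus
        dist[(x, -1)] = px * (1 - p_plus)
    return dist


def disagreement(f, dist):
    """Pr_{(x, label) ~ P}[f(x) != label]."""
    return sum(p for (x, label), p in dist.items() if f(x) != label)


def l1_distance(f, g, n):
    """E_x[|f(x) - g(x)|] under the uniform distribution."""
    return Fraction(1, 2 ** n) * sum(abs(f(x) - g(x))
                                     for x in product((-1, 1), repeat=n))


def maj3(x):
    return 1 if sum(x) > 0 else -1


def g_full(x):
    # Bounded 2-resilient function taking values in {-1, 1}.
    return -x[0] * x[1] * x[2]


def g_half(x):
    # The same function scaled into [-1/2, 1/2].
    return Fraction(-x[0] * x[1] * x[2], 2)


err_full = disagreement(maj3, label_distribution(g_full, 3))
err_half = disagreement(maj3, label_distribution(g_half, 3))
print(err_full, err_half)
```

Because the labels under $P_g$ depend on $x$ only through the resilient function $g$, low-degree statistics of the labeled examples carry no information, which is what the SQ lower bound exploits.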
Bounds on approximate resilience
1. Convert bounds on $L_2$ distance to an unbounded $d$-resilient function into bounds on $L_1$ distance to a $[-1,1]$-bounded $d$-resilient function
2. Amplify resilience degree via composition
If $f_1$ is $\alpha_1$-approximately $d_1$-resilient,
$f_2$ is $\alpha_2$-approximately $d_2$-resilient, and
$f_1$ has low noise sensitivity,
then $F(x_1, x_2, \ldots, x_k) = f_1(f_2(x_1), \ldots, f_2(x_k))$ is $\alpha_3$-approximately $(d_1 d_2)$-resilient
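The composition step can be illustrated in the exact case, leaving aside the approximation parameters and the noise-sensitivity condition. In the sketch below, which uses assumed example functions over $\{-1,1\}$ inputs and hypothetical helper names, a 1-resilient outer function is composed with a 1-resilient inner function on disjoint blocks, and the resilience of the result is measured by brute force.

```python
from itertools import combinations, product


def chi(S, x):
    """Parity on the coordinates in S, for x in {-1, 1}^n."""
    out = 1
    for i in S:
        out *= x[i]
    return out


def resilience_degree(f, n):
    """Largest d such that all Fourier coefficients of degree <= d vanish
    (returns -1 if the mean of f is nonzero)."""
    cube = list(product((-1, 1), repeat=n))
    values = [f(x) for x in cube]
    for r in range(n + 1):
        for S in combinations(range(n), r):
            # Unnormalized coefficient; only whether it is zero matters here.
            if sum(v * chi(S, x) for v, x in zip(values, cube)) != 0:
                return r - 1
    return n  # reached only by the identically zero function


def compose(f1, f2, k, block):
    """F(x_1, ..., x_k) = f1(f2(x_1), ..., f2(x_k)) on k disjoint blocks of `block` bits."""
    def F(x):
        return f1(tuple(f2(x[i * block:(i + 1) * block]) for i in range(k)))
    return F


def f1(y):
    # Outer function: parity of two bits (1-resilient).
    return y[0] * y[1]


def f2(x):
    # Inner function on four bits: a non-parity, 1-resilient Boolean function.
    return x[0] * x[1] if x[2] == 1 else x[0] * x[3]


F = compose(f1, f2, k=2, block=4)
d1, d2, dF = resilience_degree(f1, 2), resilience_degree(f2, 4), resilience_degree(F, 8)
print(d1, d2, dF)
```

In this example the composed function is 3-resilient, which is at least the product $d_1 d_2 = 1$ stated above.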
Monotone functions of $n$ variables
Known results:
• $n^{O(\sqrt{n}/\epsilon^2)}$
• $n^{\Omega(1/\epsilon^2)}$
[BT 96] [KKMS 05]
THM: SQ complexity $n^{\Omega(\sqrt{n})}$ for $\epsilon = 1/2 - o(1)$
Show existence of $o(1)$-approximately $\Omega(\sqrt{n})$-resilient monotone function
• Lower bounds on Talagrand's DNF [BBL 98]
Explicit constructions
• $o(1)$-approximately $2^{\sqrt{\log n}}$-resilient monotone function
TRIBES: disjunction of $n/\log n$ conjunctions of $\log n$ variables
+ amplification
• $2^{\sqrt{\log n}/\log\log n}$-resilient Boolean function $o(1)$-close to a monotone Boolean function
CycleRun function [Wieder 12]
+ amplification
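The TRIBES function described above is simple to write down. The sketch below is a toy instance with assumed parameters (six variables in three blocks of width two, rather than $n/\log n$ blocks of width $\log n$) and hypothetical helper names; it evaluates the function and checks monotonicity by brute force.

```python
from itertools import product


def tribes(x, width):
    """Disjunction of conjunctions over consecutive disjoint blocks of `width` variables.
    Inputs are 0/1 tuples; the output is +1 (true) or -1 (false)."""
    blocks = [x[i:i + width] for i in range(0, len(x), width)]
    return 1 if any(all(b == 1 for b in block) for block in blocks) else -1


def is_monotone(f, n):
    """Check that flipping any single 0 to 1 never decreases f."""
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if x[i] == 0:
                y = x[:i] + (1,) + x[i + 1:]
                if f(y) < f(x):
                    return False
    return True


n, width = 6, 2


def f(x):
    return tribes(x, width)


# Number of inputs on which some block is entirely ones.
num_true = sum(1 for x in product((0, 1), repeat=n) if f(x) == 1)
print(is_monotone(f, n), num_true)
```

With three blocks of width two, each block is satisfied with probability $1/4$, so the function is true on $64 \cdot (1 - (3/4)^3) = 37$ inputs.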
Conclusions and open problems
• Characterization can be extended to general product distributions over more general domains
• Does not hold for non-product distributions [FK14]
• Can the lower bound be obtained via a reduction to a concrete problem? E.g. learning noisy parities
• Other techniques for proving bounds on approximate resilience (or $L_1$ approximation by polynomials)
• Complexity of distribution-independent agnostic SQ learning?