Published on Tue Nov 24 2015

Super-Linear Gate and Super-Quadratic Wire Lower Bounds for Depth-Two and Depth-Three Threshold Circuits

Daniel M. Kane, Ryan Williams

In order to formally understand the power of neural computing, we first need to crack the frontier of threshold circuits with two and three layers, a regime that has been surprisingly intractable to analyze. The key is a new random restriction lemma for linear threshold functions. Among other results, we present a new function in $P$ based on small-biased sets, which we prove cannot be computed by a majority vote of small depth-two linear threshold circuits.

Abstract

In order to formally understand the power of neural computing, we first need to crack the frontier of threshold circuits with two and three layers, a regime that has been surprisingly intractable to analyze. We prove the first super-linear gate lower bounds and the first super-quadratic wire lower bounds for depth-two linear threshold circuits with arbitrary weights, and depth-three majority circuits computing an explicit function. We prove that for all $\epsilon \gg \sqrt{\log(n)/n}$, the linear-time computable Andreev's function cannot be computed on a $(1/2+\epsilon)$-fraction of $n$-bit inputs by depth-two linear threshold circuits of $o(\epsilon^3 n^{3/2}/\log^3 n)$ gates, nor can it be computed with $o(\epsilon^3 n^{5/2}/\log^{7/2} n)$ wires. This establishes an average-case ``size hierarchy'' for threshold circuits, as Andreev's function is computable by uniform depth-two circuits of $O(n^3)$ linear threshold gates, and by uniform depth-three circuits of $O(n)$ majority gates. We present a new function in $P$ based on small-biased sets, which we prove cannot be computed by a majority vote of depth-two linear threshold circuits with $o(n^{3/2}/\log^3 n)$ gates, nor with $o(n^{5/2}/\log^{7/2} n)$ wires. We give tight average-case (gate and wire) complexity results for computing PARITY with depth-two threshold circuits; the answer turns out to be the same as for depth-two majority circuits. The key is a new random restriction lemma for linear threshold functions. Our main analytical tool is the Littlewood-Offord Lemma from additive combinatorics.
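For reference, the Littlewood-Offord Lemma invoked above is, in its standard Erdős form, an anti-concentration inequality for signed sums; the statement below is the textbook version and may differ in normalization from the exact variant used in the paper. If $a_1,\dots,a_n$ are reals with $|a_i| \ge 1$ for all $i$, and $\varepsilon_1,\dots,\varepsilon_n$ are independent uniform signs in $\{-1,+1\}$, then for every open interval $I$ of length $2$,
\[
\Pr\!\Big[\sum_{i=1}^{n} \varepsilon_i a_i \in I\Big] \;\le\; \binom{n}{\lfloor n/2\rfloor}\, 2^{-n} \;=\; O\!\big(1/\sqrt{n}\big).
\]
Intuitively, the bound limits the probability that a weighted sum of random signs lands near any fixed threshold value, which is the kind of control a random restriction argument for linear threshold gates needs.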

Sun May 31 2020
Machine Learning
Neural Networks with Small Weights and Depth-Separation Barriers
In studying the expressiveness of neural networks, an important question is whether there are functions which can only be approximated by sufficiently deep networks. For constant depths, existing results are limited to depths 2 and 3, and achieving results for higher depths has been an important open question.
Sun Feb 14 2016
Neural Networks
Benefits of depth in neural networks
For any positive integer $k$, there exist neural networks with $\Theta(k^3)$ layers, $\Theta(1)$ nodes per layer, and $\Theta(1)$ distinct parameters which cannot be approximated by networks with $O(k)$ layers unless they are exponentially large. This result is proved here for a class of nodes termed "semi-algebraic gates".
Sat Jan 30 2021
Machine Learning
Size and Depth Separation in Approximating Benign Functions with Neural Networks
Functions of interest usually have a polynomially-bounded Lipschitz constant. We explore the benefits of size and depth for the approximation of such benign functions with ReLU networks. Proving the existence of a benign function that cannot be approximated by polynomial-size networks of constant depth would settle longstanding open problems.
Wed Nov 08 2017
Neural Networks
Lower bounds over Boolean inputs for deep neural networks with ReLU gates
We study high-depth networks that use ReLU gates and try to understand the role of depth in such neural networks. We show size lower bounds against network architectures that were hitherto unexplored. We also show that there exists a Sum-of-ReLU-of-ReLU function which small neural nets cannot represent.
Mon Feb 26 2018
Neural Networks
Limits on representing Boolean functions by linear combinations of simple functions: thresholds, ReLUs, and low-degree polynomials
We consider the problem of representing Boolean functions exactly by "sparse" linear combinations of simple functions. We provide generic tools for proving lower bounds on representations of this kind. Applying these, we give several new lower bounds for "semi-explicit" Boolean functions.
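As a toy illustration of this notion (our example, not one taken from the paper): over $\{0,1\}$ inputs, the parity of two bits is an exact linear combination of two OR gates and a constant, where each OR is a threshold function, $\mathrm{OR}(a,b) = [a+b \ge 1]$:
\[
x_1 \oplus x_2 \;=\; \mathrm{OR}(x_1, x_2) + \mathrm{OR}(1-x_1,\, 1-x_2) - 1 .
\]
The lower-bound question is how many such terms are necessary when the target function is hard and the base class is simple (thresholds, ReLUs, or low-degree polynomials).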
Mon Feb 27 2017
Machine Learning
Depth Separation for Neural Networks
We give a simple proof that shows that poly-size depth two neural networks with (exponentially) bounded weights cannot approximate a function of the form $f(\mathbf{x},\mathbf{x}') = g(\langle\mathbf{x},\mathbf{x}'\rangle)$ whenever $g$ cannot be approximated by a low degree polynomial. The result holds w.r.t. the uniform distribution on $\mathbb{S}^{d-1} \times \mathbb{S}^{d-1}$.