On the Hardness of Learning under Symmetries

Abstract

We study the problem of learning equivariant neural networks via gradient descent. A recent line of learning-theoretic research has demonstrated that learning shallow, fully-connected (i.e., non-symmetric) networks has exponential complexity in the correlational statistical query (CSQ) model, a framework that encompasses gradient descent. We ask: are known problem symmetries sufficient to alleviate the fundamental hardness of learning neural networks with gradient descent? We answer this question in the negative. In particular, we give lower bounds for shallow graph neural networks, convolutional networks, invariant polynomials, and frame-averaged networks for permutation subgroups, all of which scale either superpolynomially or exponentially in the relevant input dimension. Therefore, learning the complete classes of functions represented by equivariant neural networks via gradient descent remains hard.
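
For readers unfamiliar with the CSQ framework referenced above, the following is a minimal sketch of the standard correlational statistical query oracle (a generic textbook formulation; the symbols $\mathcal{D}$, $f$, $\phi$, and $\tau$ are illustrative and are not taken from the paper). Instead of seeing labeled examples directly, the learner interacts with an oracle:

% Standard CSQ oracle (generic formulation, not the paper's notation):
% given a bounded query \phi and a tolerance \tau > 0, the oracle returns
% any value v within \tau of the correlation between \phi and the target f.
\[
\mathrm{CSQ}(\phi, \tau) \;\to\; v
\quad \text{with} \quad
\bigl|\, v - \mathbb{E}_{x \sim \mathcal{D}}\!\left[\phi(x)\, f(x)\right] \bigr| \le \tau,
\]
where $f$ is the unknown target function, $\mathcal{D}$ the input distribution, $\phi : \mathcal{X} \to [-1, 1]$ a query chosen by the learner, and $\tau > 0$ the tolerance. Gradient descent on the squared loss can be simulated in this model because the label-dependent part of each (noisy) gradient estimate is a correlation of the form $\mathbb{E}_{x \sim \mathcal{D}}[\nabla_\theta h_\theta(x)\, f(x)]$, which is why CSQ lower bounds are commonly read as hardness results for gradient-based training.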

Publication
International Conference on Learning Representations (spotlight, top 5% of submissions)