Published on Tue Jul 30 2019

Nonconvex Zeroth-Order Stochastic ADMM Methods with Lower Function Query Complexity

Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang

Zeroth-order methods are powerful optimization tools for solving many machine learning problems. They need only function values (not gradients) in the optimization. We propose a class of faster zeroth-order stochastic alternating direction method of multipliers (ADMM) methods (ZO-SPIDER-ADMM) to solve nonconvex finite-sum problems.

Abstract

Zeroth-order methods are powerful optimization tools for solving many machine learning problems because they need only function values (not gradients) in the optimization. Although many zeroth-order methods have been developed recently, these approaches still have two main drawbacks: 1) high function query complexity; 2) poor suitability for problems with complex penalties and constraints. To address these drawbacks, we propose a class of faster zeroth-order stochastic alternating direction method of multipliers (ADMM) methods (ZO-SPIDER-ADMM) to solve nonconvex finite-sum problems with multiple nonsmooth penalties. Moreover, we prove that the ZO-SPIDER-ADMM methods achieve a lower function query complexity of $O(nd+dn^{\frac{1}{2}}\epsilon^{-1})$ for finding an $\epsilon$-stationary point, which improves on the best existing nonconvex zeroth-order ADMM methods by a factor of $O(d^{\frac{1}{3}}n^{\frac{1}{6}})$, where $n$ and $d$ denote the sample size and the dimension of the data, respectively. We also propose a class of faster zeroth-order online ADMM methods (ZOO-ADMM+) to solve nonconvex online problems with multiple nonsmooth penalties, and we prove that the proposed ZOO-ADMM+ methods achieve a lower function query complexity of $O(d\epsilon^{-\frac{3}{2}})$, which improves on the best existing result by a factor of $O(\epsilon^{-\frac{1}{2}})$. Extensive experimental results on the structured adversarial attack on black-box deep neural networks demonstrate the efficiency of our new algorithms.
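To illustrate what "only function values" means in practice, the following is a minimal Python sketch of a coordinate-wise finite-difference gradient estimate. The abstract does not say which estimator the proposed methods use, so the estimator choice, the function name `zo_coordinate_gradient`, the smoothing parameter `mu`, and the toy objective `f` are all assumptions made for illustration. The sketch also shows one plausible reason the dimension $d$ appears in query complexities: this particular estimator costs $2d$ function queries per gradient estimate.

```python
import numpy as np


def zo_coordinate_gradient(f, x, mu=1e-5):
    """Estimate the gradient of f at a 1-D point x from function values only.

    Hypothetical helper: uses central differences along each coordinate,
    which costs 2 * d function queries for a d-dimensional x.
    """
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    queries = 0
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = 1.0  # j-th standard basis vector
        grad[j] = (f(x + mu * e) - f(x - mu * e)) / (2.0 * mu)
        queries += 2
    return grad, queries


def f(x):
    """Toy smooth nonconvex objective (an assumption, not from the paper)."""
    return 0.5 * np.sum(x ** 2) + np.sum(np.sin(x))


x0 = np.array([0.3, -1.2, 2.0])
grad_est, n_queries = zo_coordinate_gradient(f, x0)

# Analytic gradient of the toy objective, used only to check the estimate.
grad_true = x0 + np.cos(x0)
max_abs_error = float(np.max(np.abs(grad_est - grad_true)))
```

Here `n_queries` equals `2 * x0.size`, and `max_abs_error` should be small because the central-difference error shrinks with `mu ** 2` for smooth functions. In a black-box setting, such as the adversarial attack experiments mentioned above, `f` would be replaced by queries to the model being attacked, where no analytic gradient is available for comparison.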