An Effective Hard Thresholding Method for Nonconvex Sparse Learning

An Effective Hard Thresholding Method Based on Stochastic Variance Reduction for Nonconvex Sparse Learning


We propose a hard thresholding method based on stochastically controlled stochastic gradients (SCSG-HT) to solve a family of sparsity-constrained empirical risk minimization problems. To reduce the variance of the stochastic gradients, SCSG-HT uses batch gradients rather than full gradients, with the batch size pre-determined by the desired precision tolerance. It also uses a geometric distribution to determine the number of loops per epoch. We prove that, like the latest methods based on stochastic gradient descent or stochastic variance reduction, SCSG-HT enjoys a linear convergence rate. Moreover, SCSG-HT has a strong guarantee of recovering the optimal sparse estimator. The computational complexity of SCSG-HT is independent of the sample size n when n is larger than 1/ϵ, which improves scalability to massive-scale problems. Empirical results demonstrate that SCSG-HT outperforms several competitors and achieves the largest decrease in objective value at the same computational cost.
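The ingredients named in the abstract (a batch gradient in place of the full gradient, a geometrically distributed number of inner steps, and a hard thresholding step) can be illustrated with a minimal Python sketch. This is not the released software package. It assumes a least-squares loss, a fixed step size, mini-batches drawn from the whole data set, and a geometric distribution with mean batch_size / mini_batch; the names `hard_threshold`, `least_squares_grad`, and `scsg_ht`, along with all parameter values, are hypothetical choices made for illustration only.

```python
import numpy as np


def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and set the rest to zero."""
    out = np.zeros_like(x)
    if k > 0:
        keep = np.argpartition(np.abs(x), -k)[-k:]
        out[keep] = x[keep]
    return out


def least_squares_grad(A, y, x, idx):
    """Average gradient of 0.5 * (a_i @ x - y_i)**2 over the rows in idx."""
    return A[idx].T @ (A[idx] @ x - y[idx]) / len(idx)


def scsg_ht(A, y, k, batch_size, mini_batch=1, step=0.01, epochs=30, seed=0):
    """SCSG-style hard thresholding loop (illustrative sketch, assumed details)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(epochs):
        # Snapshot gradient on a random batch instead of the full data set.
        snapshot = x.copy()
        batch = rng.choice(n, size=batch_size, replace=False)
        g_batch = least_squares_grad(A, y, snapshot, batch)
        # Assumed parameterization: NumPy's geometric has support {1, 2, ...},
        # so subtracting 1 gives a count with mean batch_size / mini_batch.
        p = mini_batch / (mini_batch + batch_size)
        n_inner = rng.geometric(p) - 1
        for _ in range(n_inner):
            idx = rng.choice(n, size=mini_batch)
            # Variance-reduced stochastic gradient.
            v = (least_squares_grad(A, y, x, idx)
                 - least_squares_grad(A, y, snapshot, idx)
                 + g_batch)
            # Gradient step followed by projection onto k-sparse vectors.
            x = hard_threshold(x - step * v, k)
    return x


# Synthetic noiseless example with a 3-sparse ground truth (made-up data).
rng = np.random.default_rng(1)
n, d, k = 500, 20, 3
x_true = np.zeros(d)
x_true[:k] = [5.0, -4.0, 3.0]
A = rng.standard_normal((n, d))
y = A @ x_true
x_hat = scsg_ht(A, y, k=k, batch_size=100)
print(np.count_nonzero(x_hat), np.linalg.norm(x_hat - x_true))
```

In this sketch the per-epoch cost depends on `batch_size` and the number of inner steps rather than on n, which mirrors the scalability argument in the abstract. The step size and batch size here are untuned placeholders, not the values the paper's analysis prescribes.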

Click here to download the software package.

The related paper was published at AAAI 2020.

This is an open-source program for non-commercial use only. Please contact either Dr. Jinbo Bi or Guannan Liang for updates on ongoing progress.

Contact Jinbo Bi for information about this page.