Abstract
Renewed interest in the relationship between artificial and biological neural networks motivates the study of gradient-free methods. Considering the linear regression model with random design, we theoretically analyze in this work the biologically motivated (weight-perturbed) forward gradient scheme that is based on random linear combinations of the gradient. If d denotes the number of parameters and k the number of samples, we prove that the mean squared error of this method converges for k ≳ d²log(d) with rate d²log(d)/k. Compared to the dimension dependence d for stochastic gradient descent, an additional factor d log(d) occurs.
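To illustrate the scheme the abstract refers to, the sketch below implements a weight-perturbed forward gradient step for linear regression in NumPy: the per-sample gradient g of the squared loss is replaced by the random linear combination (g·ξ)ξ with a standard Gaussian direction ξ, which has expectation g, and the parameter is updated as in stochastic gradient descent. The squared-error loss, step size, and all names are illustrative assumptions and not taken from the paper.

```python
import numpy as np

def forward_gradient_linreg(X, y, lr=0.001, seed=0):
    """Weight-perturbed forward gradient descent for linear regression.

    Minimal sketch: each step replaces the per-sample gradient g by
    (g . xi) * xi with xi ~ N(0, I_d), a random linear combination of
    the gradient with expectation g, and takes an SGD-style update.
    Step size and loss are illustrative choices, not from the paper.
    """
    rng = np.random.default_rng(seed)
    k, d = X.shape
    theta = np.zeros(d)
    for i in range(k):                      # one pass over the k samples
        residual = y[i] - X[i] @ theta      # scalar residual for sample i
        grad = -2.0 * residual * X[i]       # gradient of the squared loss
        xi = rng.standard_normal(d)         # random perturbation direction
        fwd_grad = (grad @ xi) * xi         # forward-gradient estimate of grad
        theta -= lr * fwd_grad              # stochastic update with the estimate
    return theta

# Hypothetical usage: d parameters and k on the order of d^2 log(d) samples,
# matching the sample-size regime stated in the abstract.
d = 10
k = int(d**2 * np.log(d) * 20)
rng = np.random.default_rng(1)
X = rng.standard_normal((k, d))
theta_star = rng.standard_normal(d)
y = X @ theta_star + 0.1 * rng.standard_normal(k)
theta_hat = forward_gradient_linreg(X, y)
print(np.mean((theta_hat - theta_star) ** 2))  # mean squared error
```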
Original language | English |
---|---|
Article number | 106174 |
Journal | Journal of Statistical Planning and Inference |
Volume | 233 |
DOIs | |
Publication status | Published - Dec 2024 |
Keywords
- Convergence rates
- Estimation
- Gradient descent
- Linear model
- Zeroth-order methods