
My guess is that neural networks do not work very well in noisy environments: the lower the signal-to-noise ratio, the worse a neural network performs compared to other statistical modeling tools. For example, neural networks are good at predicting credit card fraud, but they give much less impressive results when trying to predict financial markets (which are very noisy, at least in the short term).

Is there any theoretical result and/or empirical evidence on that?

Richard Hardy
RockZen
  • This is an overly broad question. Can you be more specific? – Arun Jose Oct 28 '16 at 09:46
  • IMHO, this is not too broad. There should be some generic knowledge about this, after all. – Richard Hardy Oct 28 '16 at 17:42
  • Why would this be the case? Why would other ML approaches be any better in situations where there is substantial idiosyncratic error? – generic_user Oct 28 '16 at 19:33
  • If you overfit a network it will try to follow the noise as well as the signal, but this can happen for spline and polynomial fits too. There are techniques to avoid overfitting. – KAE Oct 02 '18 at 19:38

1 Answer


If you have lots of inputs but only a few that matter, the neural network may try to overfit to that noise. L1 regularization can help here because it leads to sparse weight vectors: the weights linked to noisy or unimportant variables are driven to zero (you need to find an appropriate regularization strength). You can read L1 Norm Regularization and Sparsity Explained for Dummies by Shi Yan if you want an intuitive grasp of how that works:

The reason for using the L1 norm to find a sparse solution is its special shape. It has spikes that happen to be at sparse points. Using it to touch the solution surface will very likely find a touch point on a spike tip and thus a sparse solution.
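
For a concrete illustration (this sketch is mine, not from the article: the data, the network size and the regularization strength are all made-up assumptions), here is roughly how an L1 penalty on the input layer can push the weights attached to noise-only inputs towards zero in PyTorch:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

n, d, d_signal = 1000, 50, 5           # 50 inputs, only the first 5 carry signal
X = torch.randn(n, d)
true_w = torch.zeros(d)
true_w[:d_signal] = torch.randn(d_signal)
y = X @ true_w + 0.1 * torch.randn(n)  # target depends only on the signal inputs

model = nn.Sequential(nn.Linear(d, 16), nn.ReLU(), nn.Linear(16, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
l1_strength = 1e-2                     # needs tuning, as noted above

for _ in range(500):
    opt.zero_grad()
    pred = model(X).squeeze(-1)
    # MSE on the targets plus an L1 penalty on the input layer's weights
    loss = nn.functional.mse_loss(pred, y) + l1_strength * model[0].weight.abs().sum()
    loss.backward()
    opt.step()

# Per-input weight magnitudes in the first layer: the columns for the
# noise-only inputs should end up much smaller (near zero) than the
# columns for the signal inputs.
col_norms = model[0].weight.abs().sum(dim=0)
print(col_norms[:d_signal])        # signal inputs: clearly non-zero
print(col_norms[d_signal:].max())  # noise inputs: close to zero
```

Note that with plain gradient steps the penalized weights typically end up near zero rather than exactly zero; exact sparsity would need a proximal/thresholding step or post-hoc pruning, but the near-zero columns still show which inputs the penalty has effectively switched off.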

With lots of noise, training might require more data, since the noise enlarges the space the model has to search, and it may require larger batch sizes and smaller learning rates to avoid stepping in the wrong direction. However, deliberately adding a little noise can prove beneficial in some cases. Train Neural Networks With Noise to Reduce Overfitting by Jason Brownlee explains how:

The addition of noise during the training of a neural network model has a regularization effect and, in turn, improves the robustness of the model. It has been shown to have a similar impact on the loss function as the addition of a penalty term, as in the case of weight regularization methods.
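
As a rough, self-contained sketch of what that looks like in practice (again my own illustration with made-up data and an arbitrary noise level, not code from the linked post), fresh Gaussian noise can be added to the inputs on every training pass, while the clean inputs are used at evaluation time:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(500, 10)
y = (X[:, 0] - 2 * X[:, 1]).unsqueeze(-1) + 0.1 * torch.randn(500, 1)

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
noise_std = 0.1  # hyperparameter: strength of the injected noise

model.train()
for _ in range(300):
    opt.zero_grad()
    X_noisy = X + noise_std * torch.randn_like(X)   # fresh noise on every pass
    loss = nn.functional.mse_loss(model(X_noisy), y)
    loss.backward()
    opt.step()

# The noise is only a training-time regularizer: evaluate on clean inputs.
model.eval()
with torch.no_grad():
    print(nn.functional.mse_loss(model(X), y).item())
```

The noise level acts as a regularization hyperparameter: too little has no effect, too much drowns out the signal.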

  • Hi, welcome to CV. Please add a reference for your link in case it dies in the future. Please also summarize the content of the link. – Antoine Jul 12 '21 at 14:08
  • I wouldn't say that "it judges useful by setting the weights of the others to 0" is accurate. That's the effect of L1 regularization, but that is far from how the mechanism works. L1 regularization applies a constant penalty based on the L1 norm which 'pushes' weights towards zero uniformly (unlike L2 which penalises larger weights more than smaller ones), while the optimizer attempts to adjust weights from backpropagation (which may pull weights away from zero); this push-pull results in the 'most important' features having non-zero weights. Learning rate and L1 amount will need fine-tuning. – Avelina Jul 12 '21 at 15:41
  • @Avelina, I agree with the push-pull aspect and I didn't try to go into details about how this works. But taking only this into account could result in smaller weights for less important variables, just like L2 but with less punishment on magnitude. The sparsity comes from the "pointiness" of the shape whose surface has constant L1 norm, the pointy bits being sparse vectors. (Hard to summarize here, please read [L1 Norm Regularization and Sparsity Explained for Dummies](https://blog.mlreview.com/l1-norm-regularization-and-sparsity-explained-for-dummies-5b0e4be3938a).) – Jonathan Sands Jul 21 '21 at 11:21