I want to build a neural network to predict $f(x)=\exp(-x)$ for every $x$ in the interval $(0, 5)$. I generated a training set of 500K examples drawn uniformly at random from the interval, with targets computed exactly by $f$. Which model would be best suited for this simple task? I understand this is not generally the best use case for neural nets. I have tried a bunch of networks and they do all right, but I want extremely precise accuracy. Any tips or words of advice?
What is "extremely precise accuracy"? How do you measure it? What numerical quantity of "accuracy" would satisfy you? What network architecture have you tried, and what level of "accuracy" did you achieve? Are you familiar with the universal approximation theorem? https://en.wikipedia.org/wiki/Universal_approximation_theorem – Sycorax Dec 21 '18 at 17:27
1 Answer
import keras as K
import numpy as np

DATA_SIZE = 500000

def gen_data():
    # Draw x uniformly from [0, 5) and compute the target exactly.
    x_ = 5 * np.random.rand(DATA_SIZE)
    y_ = np.exp(-x_)
    return x_, y_

x, y = gen_data()
x_test, y_test = gen_data()

model = K.Sequential([
    K.layers.Dense(8, input_shape=(1,)),
    K.layers.Activation("elu"),
    K.layers.Dense(1),
])

sgd = K.optimizers.SGD(lr=0.01, momentum=0.9)
model.compile(optimizer=sgd, loss="mse")
model.fit(x=x, y=y, epochs=1, batch_size=32, validation_data=(x_test, y_test))
A network with a single hidden layer of 8 units, trained for one epoch, achieves a validation loss of val_loss: 6.3314e-06.
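
To pin down what "extremely precise" means numerically, per the comment on the question, one option (a minimal sketch, assuming the `model` above has already been fit) is to measure the worst-case absolute error on a dense grid over the interval:

import numpy as np

grid = np.linspace(0.01, 4.99, 10001).reshape(-1, 1)  # dense grid inside (0, 5)
pred = model.predict(grid)
max_abs_err = np.max(np.abs(pred - np.exp(-grid)))    # worst-case pointwise error
print("max |error| on the grid:", max_abs_err)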

Sycorax

Why ELU and not ReLU? Could the activation function be designed to match exp(-x)? – John Doe Dec 21 '18 at 18:02

ELU has nice properties (https://arxiv.org/abs/1511.07289), and dying ReLU can be a problem (https://stats.stackexchange.com/questions/379884/why-cant-a-single-relu-learn-a-relu/379957#379957). If your activation function exactly matches what you want to approximate, then you don't need an activation function or weights or biases or a neural network. – Sycorax Dec 21 '18 at 18:03
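
For reference, a minimal NumPy sketch of the two activations under discussion (alpha = 1 is the Keras default for ELU); note the exponential component of ELU for negative inputs:

import numpy as np

def relu(z):
    # Zero for z < 0; the gradient is also zero there, hence "dying ReLU".
    return np.maximum(0.0, z)

def elu(z, alpha=1.0):
    # Identity for z > 0, alpha * (exp(z) - 1) otherwise.
    return np.where(z > 0, z, alpha * (np.exp(z) - 1.0))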

@Sycorax it's unclear how that applies here. How would a ReLU function exactly match what I want to approximate? – John Doe Dec 21 '18 at 18:22

I didn't say that it did; I thought I was addressing your second question. What is your question? – Sycorax Dec 21 '18 at 18:26

Could you just explain this statement: "if your activation function exactly matches what you want to approximate, then you don't need an activation function or weights or biases or a neural network"? I don't understand what you mean. – John Doe Dec 21 '18 at 18:30

@JohnDoe you misunderstood Sycorax's comment. You asked him: "Could the activation function be designed to match exp(-x)?" He told you: "if your activation function exactly matches what you want to approximate, then you don't need an activation function or weights or biases or a neural network", i.e., if you design your activation function to match the function you want to approximate (exp(-x)), then you don't need a neural network in the first place! This has nothing to do with ELU or ReLU. – DeltaIV Dec 21 '18 at 18:30
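
To make that concrete, a hypothetical illustration (the `trivial_net` function and its parameters are invented for this example): if the activation were exp(-x) itself, a single unit with weight 1 and bias 0 would reproduce the target exactly, leaving nothing to learn.

import numpy as np

def trivial_net(x, w=1.0, b=0.0):
    # A single-unit "network" whose activation is the target function itself.
    activation = lambda z: np.exp(-z)
    return activation(w * x + b)

xs = np.linspace(0.01, 4.99, 7)
print(np.allclose(trivial_net(xs), np.exp(-xs)))  # True: the "net" is just f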

Sycorax, re: `relu`, I think the reason you get orders-of-magnitude better results with `elu` than with `relu` has nothing to do with dying neurons; it's just that `elu` has an exponential component. I plan to prove this with a few experiments, including LeakyReLU (which doesn't suffer from the dying-neurons issue). – DeltaIV Dec 22 '18 at 09:19

@DeltaIV That would be an interesting experiment! If you ask & answer your own question, you'll probably garner a great many points. I think you're probably correct, since on $[-5, 0]$ the activation function is essentially the function we want to model, and it's easy to fit the appropriate weights; so the real effect is what I hinted at above: the activation matches what we seek to model. – Sycorax Dec 22 '18 at 13:50
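
A worked version of that hint, with hand-set rather than trained weights: for ELU with alpha = 1, elu(z) = exp(z) - 1 whenever z < 0, so exp(-x) = elu(-x) + 1 for every x > 0. A single ELU unit with input weight -1, followed by an output layer with weight 1 and bias 1, therefore represents f exactly on (0, 5):

import numpy as np

def elu(z, alpha=1.0):
    return np.where(z > 0, z, alpha * (np.exp(z) - 1.0))

# Hand-set weights: hidden weight -1, hidden bias 0; output weight 1, output bias 1.
xs = np.linspace(0.01, 4.99, 9)
y_hat = 1.0 * elu(-1.0 * xs) + 1.0
print(np.allclose(y_hat, np.exp(-xs)))  # True: exact representation on (0, 5)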

"Many points" is probably a bit optimistic :-) but I do hope it will be well received. I will write it in the next few days. – DeltaIV Dec 22 '18 at 15:33

@Sycorax I have a bit more spare time now. Do you still think it would be a good idea to ask & answer such a question? – DeltaIV Dec 30 '18 at 18:47