In the limit of infinitely many training samples, what you propose (not splitting into three sets) is valid.
But imagine a regression problem with pairs $(x, y)$, where $x \in [0, 10]$ and the underlying relation is $y = 2x$. Let us exaggerate for the sake of argument and say you have only 3 training data points: $\{(1,2), (9,18), (5,10)\}$. Now you train a network on this regression problem until the training error is zero, which a very simple network can achieve. But the network learns an essentially arbitrary mapping for the values that are not in the training set, so when you feed it a new test point, it can return a nonsense value.
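To make this concrete, here is a minimal numpy sketch. The bump term in `f_overfit` is my own construction (a trained network would not literally produce it), but it shows that zero training error on three points pins down nothing about the values in between:

```python
import numpy as np

# Three training points lying exactly on y = 2x.
x_train = np.array([1.0, 5.0, 9.0])
y_train = 2 * x_train

def f_true(x):
    return 2 * x  # the underlying relation

def f_overfit(x):
    # Adds a bump that vanishes exactly at x = 1, 5, 9,
    # so the training error is still zero.
    return 2 * x + 5 * np.sin(np.pi * (x - 1) / 4)

print(np.max(np.abs(f_overfit(x_train) - y_train)))  # ~0 (up to float round-off)
x_test = np.array([3.0, 7.0])
print(f_true(x_test))     # [ 6. 14.]
print(f_overfit(x_test))  # [11.  9.] -- nonsense away from the training set
```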
Now, the condition in the question is that you have a training set which perfectly reflects the characteristics of your data. However, unless you have infinitely many points in the interval $[0, 10]$, there will always be gaps in the regression function that the network learns, and hence you will always face an over-fitting situation.
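The same construction works for any finite training set, not just for 3 points. In the sketch below (again an artificial worst case, not a trained model; the 20-point set and the $10^{-3}$ scale are arbitrary choices of mine), the polynomial bump vanishes at every training input, so the training error is exactly zero while the predictions at the gap midpoints are wildly wrong:

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0, 10, size=20))  # 20 points: many, but finite

def f_overfit(x):
    # Product of (x - xi) vanishes at every training input xi.
    bump = np.prod([x - xi for xi in x_train], axis=0)
    return 2 * x + 1e-3 * bump

print(np.max(np.abs(f_overfit(x_train) - 2 * x_train)))  # 0.0 by construction
x_gap = (x_train[:-1] + x_train[1:]) / 2                 # midpoints of the gaps
print(np.max(np.abs(f_overfit(x_gap) - 2 * x_gap)))      # huge errors in the gaps
```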
As whuber also stated in his comment, another effect is that of noise in the data. That means that for every point in your training data you would additionally need infinitely many points, one for each possible value of, say, Gaussian noise, so that the network can learn to ignore the noise as well. Otherwise, you get the well-known over-fitting.
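Here is a sketch of that noise effect, with a degree-9 polynomial standing in for an over-parameterized network (the noise scale and point count are arbitrary choices of mine): with as many parameters as training points, the model interpolates the noise exactly, and its test error is far worse than the noise level.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(1)
x_train = np.linspace(0, 10, 10)
y_train = 2 * x_train + rng.normal(scale=1.0, size=x_train.size)  # y = 2x + noise

# Degree 9 with 10 points: enough parameters to interpolate the noise exactly.
fit = Polynomial.fit(x_train, y_train, deg=9)

print(np.max(np.abs(fit(x_train) - y_train)))   # ~0: the noise was memorized
x_test = np.linspace(0.5, 9.5, 10)
print(np.max(np.abs(fit(x_test) - 2 * x_test))) # typically >> the noise std of 1
```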