I understand that one can use kernel functions (e.g. the radial kernel) to create a non-linear decision boundary.
However, there is something wrong with my logic, and I am sure I have misunderstood something:
I understand that kernel functions operate in a high-dimensional, implicit feature space without ever computing the coordinates of the data in that space; instead, they simply compute the inner products between the images of all pairs of data points in the feature space. This is often computationally cheaper than computing the coordinates explicitly.
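To make that statement concrete, here is a minimal sketch in Python with NumPy. It assumes a homogeneous degree-2 polynomial kernel rather than the radial kernel, because its feature map is finite and can be written out; the names `phi` and `poly2_kernel` are made up for illustration and do not come from any library or from the lectures.

```python
import numpy as np

def phi(x):
    """Hypothetical explicit feature map: all pairwise products x_i * x_j.

    For an input in R^3 this returns a vector in R^9.
    """
    return np.outer(x, x).ravel()

def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel, computed in the input space."""
    return float(np.dot(x, y)) ** 2

x = np.array([1.0, 2.0, 3.0])
y = np.array([0.5, -1.0, 2.0])

# Inner product computed explicitly in the 9-dimensional feature space.
explicit = float(np.dot(phi(x), phi(y)))

# The same number obtained without ever building phi(x) or phi(y).
implicit = poly2_kernel(x, y)

print(phi(x).shape, explicit, implicit)
```

Both routes give the same value, but the second one only needs a dot product of the original 3-dimensional vectors.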
Here is where my logic went wrong:
So, I believe $K: \mathbb{R}^N \rightarrow \mathbb{R}$, i.e. it maps an input from a high-dimensional space to a 1-dimensional space. (I am not sure whether this is correct.)
However, in Andrew Ng's lecture videos on SVMs, he mentioned that a kernel can also take the original data in $\mathbb{R}^1$ and map it to a very high-dimensional feature space $\mathbb{R}^N$.
These two statements seem to contradict each other, which I find very confusing.
Please correct my misunderstanding. Thanks.