Kernel methods start from the idea of projecting data points from the input space into a high-dimensional feature space through a feature map \(\phi:\mathcal{X}\rightarrow\mathcal{F}\), in order to make the training data easier to regress or classify.
The corresponding positive-definite kernel function \(k(x,x')=\langle\phi(x),\phi(x')\rangle_{\mathcal{F}}\) defines an inner product in that feature space and enables the so-called kernel trick:
one can avoid computing the feature map explicitly by working only with inner products between training points.
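As a concrete sketch of the kernel trick, consider the homogeneous polynomial kernel of degree 2 on \(\mathbb{R}^2\), \(k(x,x')=(x^{\top}x')^2\), whose explicit feature map is known in closed form. The snippet below (a minimal illustration, not part of the original text; the names `phi` and `k` are chosen here for convenience) verifies numerically that evaluating the kernel in the input space matches the inner product after mapping into the feature space:

```python
import numpy as np

def phi(x):
    # Explicit degree-2 polynomial feature map for 2-D input:
    # phi(x) = (x1^2, x2^2, sqrt(2) * x1 * x2)
    return np.array([x[0]**2, x[1]**2, np.sqrt(2) * x[0] * x[1]])

def k(x, xp):
    # Kernel trick: the same inner product, computed directly in the
    # input space without ever forming phi(x) or phi(xp).
    return np.dot(x, xp) ** 2

x = np.array([1.0, 2.0])
xp = np.array([3.0, 0.5])

explicit = np.dot(phi(x), phi(xp))  # inner product in feature space
implicit = k(x, xp)                 # kernel evaluated in input space
print(explicit, implicit)           # the two values coincide
```

For this toy kernel the feature space is only 3-dimensional, so the saving is negligible; the trick pays off when the feature space is very high- or infinite-dimensional (as for the Gaussian kernel), since the kernel evaluation never leaves the input space.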