5.1. kernels#
An introduction to kernels and recent research.
5.1.1. kernel basics#
- basic definition - a valid kernel \(k\) is
  - continuous
  - symmetric
  - gives a PSD Gram matrix \(K_n\) with entries \(k(x_i, x_j)\) (for the linear kernel, \(K_n = XX^T\))
- kernels wiki: kernel methods memorize the training points, then use distances between points to classify
- wavelet support vector machines - kernels built from wavelets
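A minimal numpy sketch of these basics (the RBF kernel, sample size, and tolerance are illustrative assumptions, not anything from the notes above): build a Gram matrix and check that it is symmetric and PSD:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))    # 50 illustrative points in R^3
K = rbf_kernel(X, X)            # Gram matrix K_n, entries k(x_i, x_j)

print(np.allclose(K, K.T))                     # symmetric: True
print(np.linalg.eigvalsh(K).min() >= -1e-10)   # PSD (up to round-off): True
```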
5.1.1.1. ch 4 from support vector machines book#
- 4.1 - what is a valid kernel
  - in general, most dot-product-like functions constitute valid kernels
  - a function is a kernel iff it is a symmetric, positive definite function
    - positive definite means that for any points \(x_1, \ldots, x_n\), the \(n \times n\) Gram matrix with entries \(k(x_i, x_j)\) is a PSD matrix (for a translation-invariant kernel these entries take the form \(f(x_i - x_j)\))
  - a given kernel can have many feature spaces (one can construct different feature maps that yield the same inner products); see the sketch after this list
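To make the many-feature-spaces point concrete, here is a small sketch (the quadratic kernel and both feature maps are illustrative choices) showing two different feature maps, of different dimensions, whose inner products recover the same kernel:

```python
import numpy as np

def k_poly(x, y):
    """Homogeneous quadratic kernel k(x, y) = (x . y)^2 on R^2."""
    return np.dot(x, y) ** 2

def phi3(x):   # a 3-d feature map for k_poly
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

def phi4(x):   # a different, 4-d feature map for the same kernel
    return np.array([x[0] ** 2, x[1] ** 2, x[0] * x[1], x[1] * x[0]])

x, y = np.array([1.0, 2.0]), np.array([3.0, -1.0])
# all three agree (up to round-off): the kernel does not pin down the feature space
print(k_poly(x, y), np.dot(phi3(x), phi3(y)), np.dot(phi4(x), phi4(y)))
```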
- 4.2 - reproducing kernel hilbert space (RKHS) of a kernel
  - hilbert space - an abstract vector space that (1) has an inner product and (2) is complete (i.e. has enough limits so calculus works)
  - the RKHS is the smallest feature space of a kernel and can serve as its canonical feature space
  - RKHS - a \(\mathbb K\)-hilbert space consisting of functions mapping from \(X\) to \(\mathbb K\), in which the kernel reproduces point evaluations: \(f(x) = \langle f, k(\cdot, x) \rangle\)
  - every RKHS has a unique reproducing kernel
  - every kernel has a unique RKHS
  - sums and products of kernels are also valid kernels (checked numerically in the sketch below)
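A quick numerical sanity check of this closure property (the linear and RBF base kernels are arbitrary illustrative choices; the product of two kernels corresponds to the elementwise Schur product of their Gram matrices, which the Schur product theorem keeps PSD):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))

# two base Gram matrices: linear and RBF (illustrative choices)
K1 = X @ X.T
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K2 = np.exp(-sq)

for K in (K1 + K2, K1 * K2):   # sum and elementwise (Schur) product
    print(np.linalg.eigvalsh(K).min() >= -1e-8)   # both remain PSD: True
```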
5.1.2. kernel papers#
- data spectroscopy paper (shi et al. 2009)
  - kernel matrix \(K_n\)
  - laplacian matrix \(L_n = D_n - K_n\)
    - \(D_n\) is the diagonal degree matrix, with entries equal to the column sums of \(K_n\)
  - a block-diagonal kernel matrix would imply clusters (one block per cluster)
  - eigenvalues/eigenvectors of these kernel matrices can identify clusters (and are invariant to permutation of the data points)
  - want to look for data points corresponding to the same/similar eigenvectors, as in the sketch after this list
  - hard to know what kernel to use and how many eigenvectors/groups to look at
  - here, take a population point of view - relate the spectrum of \(K_n\) to the data density function via the operator \(K_P f(x) = \int K(x, y) f(y) \, dP(y)\)
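As a rough illustration of this idea (the two-cluster toy data, RBF kernel, and bandwidth are assumptions for the sketch, not the paper's setup): with well-separated clusters, \(K_n\) is nearly block-diagonal and each top eigenvector is supported on one cluster:

```python
import numpy as np

rng = np.random.default_rng(0)
# two well-separated clusters -> K_n is approximately block-diagonal
X = np.vstack([rng.normal(-4, 0.5, size=(30, 2)),
               rng.normal(+4, 0.5, size=(30, 2))])
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / 2.0)            # RBF kernel; bandwidth chosen for illustration

vals, vecs = np.linalg.eigh(K)   # eigh returns eigenvalues in ascending order
top2 = vecs[:, -2:]              # top two eigenvectors, one per cluster
# assign each point to whichever top eigenvector dominates for it
labels = np.abs(top2[:, 0]) < np.abs(top2[:, 1])
print(labels[:30].mean(), labels[30:].mean())   # ~1.0 vs ~0.0 (or vice versa)
```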
5.1.3. spectral clustering#
- interested in the top eigenvectors of \(K_n\) and the bottom eigenvectors of \(L_n\)
- scott and longuet-higgins - embed data in the space of the top eigenvectors, normalize in that space, and investigate the block structure
- perona and freeman - find 2 clusters by thresholding the top eigenvector
- shi & malik - normalized cut: threshold the second-smallest generalized eigenvector of \(L_n\) (solving \(L_n v = \lambda D_n v\)); see the sketch after this list
- similarly, kernel PCA, spectral dimensionality reduction, and SVMs can all be viewed as fitting a linear model in the eigenspace of \(K_n\)
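A minimal sketch of the normalized-cut recipe on toy data (the data, the RBF affinity, and the threshold-at-zero rule are illustrative assumptions; shi & malik also consider other splitting points):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.4, size=(25, 2)),
               rng.normal(+2, 0.4, size=(25, 2))])
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq)                # affinity / kernel matrix K_n
D = np.diag(K.sum(axis=1))     # diagonal degree matrix D_n
L = D - K                      # laplacian L_n

# generalized eigenproblem L v = lambda D v (eigenvalues returned ascending)
vals, vecs = eigh(L, D)
fiedler = vecs[:, 1]           # second-smallest generalized eigenvector
labels = fiedler > 0           # threshold to get the 2-way cut
print(labels[:25].mean(), labels[25:].mean())   # ~1.0 vs ~0.0 (or vice versa)
```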