Lecture 4 — Kernel Functions & Norms in Linear Algebra

Theory • Properties • Examples • Interactive calculators (kernels & norms)

Part A — Kernel Functions

A kernel function k(x, y) computes an inner product between feature-space mappings of x and y without explicitly computing the mapping:

k(x, y) = <φ(x), φ(y)>

The key benefit (the kernel trick) is that many learning algorithms (SVMs, kernel ridge regression, kernel PCA) only need inner products between data points. A kernel lets us work implicitly in a high-dimensional (possibly infinite-dimensional) feature space while paying only the cost of evaluating k(x, y) on the original inputs.

Common kernel functions

Kernel           Formula                               Notes
Linear           k(x,y) = x·y                          Equivalent to no mapping; fast, good baseline.
Polynomial       k(x,y) = (α x·y + c)^d                Allows interactions up to degree d.
RBF / Gaussian   k(x,y) = exp(-||x-y||² / (2σ²))       Infinite-dimensional; locally sensitive; popular.
Sigmoid          k(x,y) = tanh(α x·y + c)              Linked to neural nets; not always positive-definite.
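
The four formulas in the table can be sketched directly in plain Python. This is a minimal illustration rather than a reference implementation: the function names and the default values for α (alpha), c, d and σ (sigma) are assumptions chosen for the example, not values given in the lecture.

```python
import math

def dot(x, y):
    """Plain dot product of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    # k(x, y) = x . y
    return dot(x, y)

def polynomial_kernel(x, y, alpha=1.0, c=1.0, d=2):
    # k(x, y) = (alpha * x . y + c)^d
    return (alpha * dot(x, y) + c) ** d

def rbf_kernel(x, y, sigma=1.0):
    # k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def sigmoid_kernel(x, y, alpha=1.0, c=0.0):
    # k(x, y) = tanh(alpha * x . y + c)
    return math.tanh(alpha * dot(x, y) + c)

x, y = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(x, y))      # 1*3 + 2*4 = 11.0
print(polynomial_kernel(x, y))  # (11 + 1)^2 = 144.0
print(rbf_kernel(x, y))         # exp(-8 / 2) = exp(-4)
print(sigmoid_kernel(x, y))     # tanh(11)
```

Note that the RBF kernel of any point with itself is 1, since the squared distance is zero.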

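Properties required for valid kernels

A function k is a valid (Mercer) kernel if it is symmetric, k(x,y) = k(y,x), and positive semi-definite: for any finite set of points x₁, …, xₙ, the Gram matrix K with entries Kᵢⱼ = k(xᵢ, xⱼ) satisfies zᵀKz ≥ 0 for every real vector z.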
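
The standard conditions for a valid kernel are symmetry and positive semi-definiteness of every Gram matrix. The sketch below builds Gram matrices for an RBF kernel and a sigmoid kernel and runs a spot check on both. It is only a necessary-condition test (diagonal entries plus random quadratic forms), not a proof of validity, and the helper names, the sample points, and the sigmoid parameters α = 0.5, c = -1 are assumptions picked for illustration.

```python
import math
import random

def rbf_kernel(x, y, sigma=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def sigmoid_kernel(x, y, alpha=0.5, c=-1.0):
    # Illustrative parameters; with c < 0 this gives k(x, x) < 0 near the origin.
    return math.tanh(alpha * sum(a * b for a, b in zip(x, y)) + c)

def gram_matrix(kernel, points):
    """K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]

def is_symmetric(K, tol=1e-12):
    n = len(K)
    return all(abs(K[i][j] - K[j][i]) <= tol for i in range(n) for j in range(n))

def quadratic_form(K, z):
    """Compute z^T K z."""
    n = len(K)
    return sum(z[i] * K[i][j] * z[j] for i in range(n) for j in range(n))

def passes_psd_spot_check(K, trials=200, tol=1e-9, seed=0):
    """Return False if any tested z gives z^T K z < 0; True proves nothing."""
    n = len(K)
    # Basis vectors: e_i^T K e_i = K[i][i] must be non-negative.
    if any(K[i][i] < -tol for i in range(n)):
        return False
    rng = random.Random(seed)
    for _ in range(trials):
        z = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        if quadratic_form(K, z) < -tol:
            return False
    return True

points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [1.5, 1.5]]
K_rbf = gram_matrix(rbf_kernel, points)
K_sig = gram_matrix(sigmoid_kernel, points)

print(is_symmetric(K_rbf), passes_psd_spot_check(K_rbf))  # True True
print(is_symmetric(K_sig), passes_psd_spot_check(K_sig))  # True False
print(K_sig[0][0])  # tanh(-1), a negative diagonal entry
```

The sigmoid Gram matrix fails because its first diagonal entry is tanh(-1) < 0, which is consistent with the table's note that the sigmoid kernel is not always positive-definite.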

Example: Polynomial kernel (simple)

Let x=[x₁,x₂], y=[y₁,y₂], and choose α=1, d=2, c=1:

k(x,y) = (x·y + 1)² = (x₁y₁ + x₂y₂ + 1)²

This equals the dot product in a 6-dimensional feature space whose coordinates are the squared terms, the cross term, the linear terms, and a constant, but we never compute φ(x) explicitly.
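
A quick numerical check of this identity, as a sketch. It assumes one particular feature map that reproduces the expansion, φ(x) = [x₁², x₂², √2·x₁x₂, √2·x₁, √2·x₂, 1]; the function names phi and poly_kernel and the test vectors are illustrative choices.

```python
import math

def poly_kernel(x, y):
    """k(x, y) = (x . y + 1)^2 for 2-dimensional inputs."""
    return (x[0] * y[0] + x[1] * y[1] + 1.0) ** 2

def phi(x):
    """Explicit 6-dimensional feature map whose dot product matches poly_kernel."""
    r2 = math.sqrt(2.0)
    return [x[0] ** 2, x[1] ** 2, r2 * x[0] * x[1], r2 * x[0], r2 * x[1], 1.0]

x, y = [1.0, 2.0], [3.0, 4.0]
implicit = poly_kernel(x, y)                            # one dot product in 2 dimensions
explicit = sum(a * b for a, b in zip(phi(x), phi(y)))   # dot product in 6 dimensions
print(implicit, explicit)  # both are 144 up to floating-point rounding
```

The implicit route needs a single 2-dimensional dot product, while the explicit route first builds two 6-dimensional vectors. The gap widens quickly as the input dimension and the degree d grow.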

Machine learning usage


Part B — Norms in Linear Algebra

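A norm is a function that assigns a non-negative length or size to vectors (and matrices). Every norm satisfies three axioms: positive definiteness (||x|| ≥ 0, with ||x|| = 0 only for x = 0), absolute homogeneity (||αx|| = |α|·||x|| for any scalar α), and the triangle inequality (||x + y|| ≤ ||x|| + ||y||).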

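Vector norms

For a vector x = [x₁, …, xₙ]:

||x||₁ = |x₁| + … + |xₙ| (L1 or Manhattan norm)
||x||₂ = sqrt(x₁² + … + xₙ²) (L2 or Euclidean norm)
||x||_∞ = max over i of |xᵢ| (L∞ or max norm)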

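Matrix norms

For a matrix A with entries aᵢⱼ:

||A||_F = sqrt(sum over i,j of aᵢⱼ²) (Frobenius norm)
||A||₂ = largest singular value of A (spectral norm)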

Why norms matter in ML

Worked examples

Vector norms

For x=[3,4]: ||x||₂ = sqrt(3²+4²) = 5, ||x||₁ = |3|+|4| = 7, and ||x||_∞ = max(|3|,|4|) = 4.
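
The same three values can be reproduced with a few lines of plain Python. This is a minimal sketch; the function names are illustrative, not part of the lecture.

```python
import math

def l1_norm(x):
    """Sum of absolute values."""
    return sum(abs(v) for v in x)

def l2_norm(x):
    """Euclidean length."""
    return math.sqrt(sum(v * v for v in x))

def linf_norm(x):
    """Largest absolute entry."""
    return max(abs(v) for v in x)

x = [3.0, 4.0]
print(l1_norm(x))    # 7.0
print(l2_norm(x))    # 5.0
print(linf_norm(x))  # 4.0
```

For any vector, these satisfy ||x||_∞ ≤ ||x||₂ ≤ ||x||₁, which the example illustrates (4 ≤ 5 ≤ 7).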

Matrix norms

For A = [[1,2],[3,4]]:

||A||_F = sqrt(1+4+9+16) = sqrt(30) ≈ 5.477

The spectral norm ||A||₂ is the largest singular value of A, computed via the SVD. For this A it is approximately 5.465, slightly below the Frobenius norm.
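
Both matrix norms can be sketched without a linear algebra library. Note the swap: instead of a full SVD, the spectral norm below is estimated by power iteration on AᵀA, which converges to the largest singular value when the starting vector is not orthogonal to the top singular vector (assumed here). Function names, the iteration cap, and the tolerance are illustrative choices.

```python
import math

def frobenius_norm(A):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for row in A for v in row))

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def spectral_norm(A, iters=500, tol=1e-12):
    """Largest singular value via power iteration on A^T A (not a full SVD)."""
    At = transpose(A)
    v = [1.0] * len(A[0])  # starting vector, assumed not orthogonal to the top singular vector
    sigma = 0.0
    for _ in range(iters):
        w = matvec(At, matvec(A, v))               # w = A^T A v
        norm_w = math.sqrt(sum(c * c for c in w))
        if norm_w == 0.0:
            return 0.0
        v = [c / norm_w for c in w]                # renormalize so ||v||_2 = 1
        new_sigma = math.sqrt(sum(c * c for c in matvec(A, v)))  # ||A v||_2
        if abs(new_sigma - sigma) < tol:
            return new_sigma
        sigma = new_sigma
    return sigma

A = [[1.0, 2.0], [3.0, 4.0]]
print(frobenius_norm(A))  # sqrt(30), about 5.477
print(spectral_norm(A))   # about 5.465
```

In practice a library SVD routine is the more robust choice; power iteration is used here only to keep the example dependency-free.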

Norms and regularization

Ridge regression minimizes ||y - Xw||₂² + λ||w||₂² with λ ≥ 0; the penalty term λ||w||₂² shrinks the weights toward zero, which reduces variance at the cost of some added bias.
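
To see the shrinkage concretely, consider the simplest case of one feature and no intercept, where the objective sum((yᵢ - w·xᵢ)²) + λw² has the closed-form minimizer w = Σxᵢyᵢ / (Σxᵢ² + λ). The sketch below uses that special case; the function name and the toy data are assumptions made for illustration.

```python
def ridge_weight_1d(xs, ys, lam):
    """Minimizer of sum((y - w*x)^2) + lam * w^2 for one feature, no intercept."""
    if lam < 0:
        raise ValueError("lam must be non-negative")
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)  # undefined only if all x are zero and lam == 0

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # toy data with y = 2x exactly

for lam in (0.0, 1.0, 14.0):
    print(lam, ridge_weight_1d(xs, ys, lam))
# lam = 0  -> 2.0 (ordinary least squares)
# lam = 1  -> 28/15, about 1.867
# lam = 14 -> 1.0
```

As λ grows, the fitted weight moves from the least-squares value 2 toward zero, which is the shrinkage effect of the λ||w||₂² term.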


Interactive tools

Kernel evaluator

Enter two vectors (space-separated) and choose kernel.

Norm calculator

Enter a vector or small matrix (rows semicolon-separated).