This post is self-contained and gives a simple proof of the existence of the singular value decomposition (SVD).
The norm we consider is the L2 norm: $\|x\|_2 = \sqrt{x_1^2 + \dots + x_n^2}$. For matrices, $\|A\| = \max_{\|x\|=1} \|Ax\|$.
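As a quick numerical illustration (a sketch assuming numpy, which is not part of the proof), the operator norm $\|A\|$ can be compared against a direct maximization of $\|Ax\|$ over sampled unit vectors:

```python
import numpy as np

A = np.array([[3.0, 0.0], [4.0, 5.0]])
op_norm = np.linalg.norm(A, 2)  # spectral norm ||A||

# Sample many unit vectors on the circle and take the max of ||Ax||
thetas = np.linspace(0, 2 * np.pi, 10000)
xs = np.stack([np.cos(thetas), np.sin(thetas)])  # unit vectors in R^2
sampled_max = np.linalg.norm(A @ xs, axis=0).max()

print(op_norm, sampled_max)  # the two values agree closely
```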
We start with a lemma:
Lemma. If $A \in \mathbb{R}^{m \times n}$, then there exists a unit vector $z$ such that $A^\top A z = \sigma^2 z$, where $\sigma = \|A\|$.
Proof.
On the sphere $\{u \in \mathbb{R}^n : \|u\| = 1\}$, the continuous function $u \mapsto \|Au\|^2$ reaches a maximum at some unit vector $u_1$. By definition of $\|A\|$, this maximum value is $\sigma^2$, where $\sigma = \|A\|$. Now consider the differentiable function $F : v \mapsto \|Av\|^2 - \sigma^2 \|v\|^2$ defined on the open set $\mathbb{R}^n \setminus \{0\}$: one has $F(v) = \|v\|^2 \left( \left\| A \tfrac{v}{\|v\|} \right\|^2 - \sigma^2 \right) \le 0$, while $F(u_1) = 0$, so $F$ has a maximum at $u_1$. Therefore $u_1$ is a critical point and $\frac{\partial F}{\partial x}$ vanishes there.
Since
$$\frac{\partial F}{\partial x} = \frac{\partial}{\partial x}\left( x^\top A^\top A x - \sigma^2\, x^\top x \right) = 2 A^\top A x - 2 \sigma^2 x,$$
one has $A^\top A u_1 = \sigma^2 u_1$.
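The lemma can be checked numerically (a sketch assuming numpy): power iteration on $A^\top A$ converges, for a generic starting point, to the maximizing unit vector, which is indeed an eigenvector with eigenvalue $\sigma^2$:

```python
import numpy as np

rng = np.random.default_rng(42)
A = rng.standard_normal((4, 3))
sigma = np.linalg.norm(A, 2)  # sigma = ||A||

# Power iteration on A^T A: converges (generically) to the unit vector
# maximizing ||Au||, i.e. the vector promised by the lemma.
z = rng.standard_normal(3)
for _ in range(1000):
    z = A.T @ (A @ z)
    z /= np.linalg.norm(z)

# Check the lemma: A^T A z = sigma^2 z
print(np.allclose(A.T @ A @ z, sigma**2 * z))
```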
Theorem (Singular Value Decomposition). Let $A$ be a real $m \times n$ matrix. Then there exist orthogonal matrices
$U \in \mathbb{R}^{m \times m}$ with columns $u_1, \dots, u_m$, and $V \in \mathbb{R}^{n \times n}$ with columns $v_1, \dots, v_n$,
satisfying
$$U^\top A V = \Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_p) \in \mathbb{R}^{m \times n}, \qquad p = \min\{m, n\},$$
where $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_p \ge 0$.
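This is exactly what numerical libraries return; a minimal check with numpy's `svd` (which gives $V^\top$ rather than $V$):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

U, s, Vt = np.linalg.svd(A)  # full U (5x5), Vt = V^T (3x3)
Sigma = np.zeros((5, 3))
np.fill_diagonal(Sigma, s)

# U^T A V is diagonal with non-increasing, non-negative entries
print(np.allclose(U.T @ A @ Vt.T, Sigma))
print(np.all(np.diff(s) <= 0) and np.all(s >= 0))
```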
Proof.
Assume $A \neq 0$ (otherwise the result is immediate, with $\Sigma = 0$).
Using the lemma, there exists a unit vector $v_1$ such that $A^\top A v_1 = \sigma^2 v_1$, where $\sigma = \|A\| > 0$. Define $u_1 := \frac{A v_1}{\sigma}$ (observe that it is also a unit vector, since $\|A v_1\|^2 = v_1^\top A^\top A v_1 = \sigma^2$). Together $u_1, v_1$ satisfy
$$A v_1 = \sigma u_1, \qquad A^\top u_1 = \sigma v_1.$$
This implies that $A(v_1^\perp) \subset u_1^\perp$ (indeed, $v \perp v_1 \Rightarrow v^\top v_1 = 0 \Rightarrow (Av)^\top u_1 = v^\top A^\top u_1 = \sigma\, v^\top v_1 = 0$). Find $n-1$ vectors $v_2, \dots, v_n$ completing $v_1$ into an orthonormal basis, giving an orthogonal matrix $V$ with columns the $v_i$, and similarly $m-1$ vectors $u_2, \dots, u_m$ completing $u_1$ into an orthogonal matrix $U$.
Now,
$$U^\top A V = U^\top [A v_1 \mid A v_2 \mid \cdots \mid A v_n] = U^\top [\sigma u_1 \mid A v_2 \mid \cdots \mid A v_n].$$
Thus
$$U^\top A V = \begin{pmatrix} \sigma & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & B & \\ 0 & & & \end{pmatrix}, \qquad B \in \mathbb{R}^{(m-1) \times (n-1)}.$$
The zero entries in the first column are obtained by forming the products of the rows $u_2^\top, \dots, u_m^\top$ with the column $\sigma u_1$; the first row is obtained by forming the product
$$u_1^\top [\sigma u_1 \mid A v_2 \mid \cdots \mid A v_n],$$
remembering that all of $A v_2, \dots, A v_n$ are orthogonal to $u_1$.
Applying the same argument to the smaller matrix $B$ (induction on $\min\{m, n\}$) diagonalizes the remaining block; since $\|B\| \le \|U^\top A V\| = \|A\| = \sigma$, the singular values obtained this way come out in non-increasing order. This concludes the proof.
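The proof is constructive, and can be turned into a (deliberately naive, numerically non-robust) recursion: at each step, obtain $\sigma, v_1, u_1$ from the lemma, deflate to the block $B$, and recurse. This is a sketch assuming numpy, and the helper names `complete_basis` and `svd_from_proof` are my own:

```python
import numpy as np

def complete_basis(x):
    """Orthogonal matrix whose first column is the unit vector x."""
    m = x.shape[0]
    Q, _ = np.linalg.qr(np.column_stack([x, np.eye(m)]))
    Q[:, 0] *= np.sign(x @ Q[:, 0])  # fix the sign so that Q[:, 0] == x
    return Q

def svd_from_proof(A):
    """SVD via the proof's recursion: lemma step, then deflation on B."""
    m, n = A.shape
    if min(m, n) == 0 or np.allclose(A, 0):
        return np.eye(m), np.zeros((m, n)), np.eye(n)
    # Lemma step: top eigenpair of A^T A gives sigma and v1
    w, vecs = np.linalg.eigh(A.T @ A)   # eigenvalues in ascending order
    sigma = np.sqrt(max(w[-1], 0.0))
    v1 = vecs[:, -1]
    u1 = A @ v1 / sigma
    # Complete u1, v1 into orthogonal matrices U1, V1
    U1, V1 = complete_basis(u1), complete_basis(v1)
    # U1^T A V1 = [[sigma, 0], [0, B]]; recurse on B
    B = (U1.T @ A @ V1)[1:, 1:]
    Ub, Sb, Vb = svd_from_proof(B)
    S = np.zeros((m, n))
    S[0, 0] = sigma
    S[1:, 1:] = Sb
    Ue, Ve = np.eye(m), np.eye(n)
    Ue[1:, 1:], Ve[1:, 1:] = Ub, Vb
    return U1 @ Ue, S, V1 @ Ve

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
U, S, V = svd_from_proof(A)
print(np.allclose(U @ S @ V.T, A))
```

Because $\|B\| \le \sigma$ at every step, the diagonal of `S` comes out in non-increasing order, matching the singular values returned by `np.linalg.svd`.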
I'm Sylvain. I'm a data scientist based near Paris.
You can see some of my work on GitHub.